I. INTRODUCTION

A. BACKGROUND

The two dimensional scatter plot has been hailed by many
statisticians as being the single most powerful tool used in
exploratory data analysis, [Ref. 1]. A scatter plot presents
an entire data set in a compact, unambiguous and easily
understandable format, in which either:

1. the points lie in a nearly straight line; 

2. the points almost lie on a smooth curve; 

3. the points are scattered without any apparent correlation
between the X variables and the Y variables;

4. the points lie somewhere between (1) or (2) and (3); 

5. most of the points lie near a straight line or smooth 
curve but a few outliers are separated from the rest. 
[Ref. 2] 

These patterns or other hidden peculiarities are much easier
to discover during a brief glimpse at a well prepared
scatter plot than during an examination of a data table. For
example, the strong positive correlation between total users
and active users logged on to the W.R. Church computer
system, Figure 1.1, is more easily discerned from the
plotted points than from the tabulated data (see footnote 1).
This is a good example of case (1), described above.

Not only does this plot point out the positive trend in 
the data, it also demonstrates that it is nearly linear and 
provides a rough estimate of the relationship between the 
variables. 



1 The table in Figure 1.1 contains only a small portion of
the 472 data points included in the plot. A complete listing
of the data set takes approximately two pages of text and is
not required for demonstration purposes.

Figure 1.1 Comparison of Data Presentation Methods.

More precise mathematical expressions and confirmatory
procedures, including goodness of fit measures, can be
obtained by employing classical regression analysis techniques,
a logical enhancement of simple scatter plots,
Figure 1.2. Numerical quantifications such as the Pearson
product moment correlation also provide summaries but can be
ambiguous if not accompanied by other information, [Ref. 1,
p. 77].

Scatter plots are not invulnerable to misinterpretation.
When the scatter of points falls into category (4) or (5),
as in Figure 1.3, it may not be possible to judge the true
relationship between the variables during a quick glance at
the scatter plot, although there obviously is some relationship.
Figure 1.3 contains a plot of the first 200 points of
test set two (Appendix C) which is used in Chapter III,
Section 2 to test LOWESS' ability to follow abrupt changes
in curvature.







Figure 1.2 Linear Least Squares Regression of 
Active Users on Total Users Logged on to the 
W.R. Church Computer System. 




Figure 1.3 Scatter Plot of the First 200 Points
of Test Set Two.

Initial inspection of this data suggests the presence of
a quadratic type pattern. This impression leads naturally to
using the quadratic least squares regression line of Figure
1.4 to describe the dependence of Y on X. The accompanying
analysis of variance table lends some support to this
choice, since r² = .709.

A closer examination of this data reveals, however, that
although it looks quadratic, the actual dependence of Y on X
is not described quite that simply. Figure 1.5 demonstrates
this point very clearly. Splitting the data set into three
parts at what appear to be logical break points (X = 10, 25),
and fitting a linear least squares regression line to each,
shows that Y is not a single function of X over its entire
range. In fact, there appear to be three separate linear
trends in this data.

Y = +/C × X * 0 1 2    WHERE: C = -0.26565  0.5[?]139  -0.013564

ANALYSIS OF VARIANCE TABLE

SOURCE                      SS          DF     MS
GRAND MEAN (SEE NOTE)       2215.056      1
REGRESSION                   523.637      2    261.818
RESIDUAL                     215.312    197      1.093
TOTAL                       2954.005    200     14.770

THE SIGNIFICANCE LEVEL OF REGRESSION = .0000
(SIGNIFICANCE LEVEL = AREA UNDER CURVE BEYOND COMPUTED F)
R SQUARE (SEE NOTE) = .709
NOTE: UNWEIGHTED CASE. SEE DESCRIPTION FOR MEANING

Figure 1.4 Quadratic Regression on the First 200
Points of Test Set Two.

Analyses of this type are seldom undertaken because of
the tedium involved in selecting appropriate splitting
points once it has been determined that doing so may be
helpful.

How, then, can an analyst discover the existence of
subtle trends or define the shape of unusual patterns
contained in a scatter plot? The answer is to use local
smoothing procedures rather than global (regression) fitting
techniques. Using a flexible smoothing procedure that
responds to local changes in the data structure allows the
data itself to determine the shape of the final curve, as
opposed to the classical approach of fitting polynomials
which have predetermined shapes.

Figure 1.5 Linear Regressions on First 200 Points of
Test Set Two Split at X = 10 and 25.

The Robust Locally Weighted Regression and Scatterplot
Smoothing (LOWESS) procedure, [Ref. 3], described in the
remainder of this paper, is a very good method for
preventing the acceptance of assumptions like the one that
led to using the quadratic model in Figure 1.4. The LOWESS
smoothing technique applied to this data, the right hand
plot of Figure 1.6, shows very clearly that the dependence
of Y on X resembles a combination of three distinct linear
functions (the parameter F = .25 will be explained later).
The LOWESS smoothing process has a tendency to round angular
corners. The straight lines in the center of each segment
suggest linear trends similar to those contained in Figure
1.5.

The major problem with trying to use polynomials to
depict subtle trends or to describe unusual relationships in
a data set is that they are neither flexible nor local. By
way of example, the points on either extreme of the first of
the two plots in Figure 1.6 have a significant effect on
the middle of the fitted polynomial.



[Figure 1.6 panels: QUADRATIC REGRESSION; LOWESS F = .25]

Figure 1.6 Comparison of a Quadratic Regression and LOWESS
Smoothing (F = .25) on First 200 Points of Test Set Two.
B. SCOPE

Locally Weighted Regression and Scatterplot Smoothing
(LOWESS), introduced by William S. Cleveland in 1977,
[Ref. 3], is a generalized extension of the locally fitted
polynomial smoothing techniques used for many years in the
field of time series analysis (see footnote 1).

1 A time series is a sequence of random variables Yi which
are naturally ordered by time (i) and can therefore be
presented as a scatter plot of Yi versus i. Although i is
usually the integers, missing values can occur.

The essential idea behind the simplest of these classical
smoothing techniques is the following. If the data
points (Xi,Yi) come from an additive model of the form

\[ Y_i = g(X_i) + \epsilon_i \]

where E(εi) = 0 and Var(εi) = σ², and g(Xi) can be approximated
locally, over the interval i-M, ..., i, i+1, ..., i+M, by the
linear function

\[ Y_i = B_0(X_i) + B_1(X_i) \times X_i + \epsilon_i \]

then averaging the Yi over this range yields

\[ \hat{Y}_i = \frac{1}{2M+1} \sum_{j=-M}^{M} Y_{i+j} \]

where

\[ E(\hat{Y}_i) = B_0(X_i) + B_1(X_i) \times X_i \]

\[ \mathrm{Var}(\hat{Y}_i) = \frac{\mathrm{Var}(\epsilon_i)}{2M+1} \]

If the assumption that the εi are uncorrelated is true, then
this moving average process produces estimated Ŷi's that are
unbiased and have smaller variance than the raw Yi's. This
technique makes it easier to distinguish g(Xi) through the
noise (εi). Using a bandwidth, M, larger than the interval
over which the linearity assumption holds will introduce
bias into the results. [Ref. 4]
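The variance reduction described above can be checked numerically. The following Python sketch is not part of the thesis programs, which are written in APL and FORTRAN. The function name `moving_average`, the linear g(x), the random seed and the half-width m = 5 are all assumptions chosen only for illustration, and the X values are assumed to be evenly spaced.

```python
import numpy as np

def moving_average(y, m):
    """Centered moving average with half-width m (window of 2m+1 points).

    Only interior points have a full window, so the result has
    len(y) - 2*m entries.
    """
    y = np.asarray(y, dtype=float)
    window = np.ones(2 * m + 1) / (2 * m + 1)
    return np.convolve(y, window, mode="valid")

rng = np.random.default_rng(0)           # arbitrary seed (assumption)
x = np.linspace(0.0, 10.0, 2001)         # evenly spaced X (assumption)
g = 1.0 + 0.5 * x                        # hypothetical linear g(x)
y = g + rng.normal(0.0, 1.0, x.size)     # additive model with sigma^2 = 1

m = 5
smooth = moving_average(y, m)
raw_err = y[m:-m] - g[m:-m]              # noise in the raw Y values
smooth_err = smooth - g[m:-m]            # noise left after averaging

# The second variance should be close to the first divided by (2m + 1).
print(raw_err.var(), smooth_err.var())
```

Because the window is symmetric and g is exactly linear here, averaging leaves g unchanged at the interior points, which is the unbiasedness property; only the noise is reduced.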

The purpose of this thesis is to translate the generalization
of classical smoothing techniques proposed by
Cleveland [Ref. 3], and expounded upon by Chambers et al
[Ref. 1], into user friendly computer programs available for
use as exploratory data analysis tools by students and
faculty of the Naval Postgraduate School.

LOWESS, written in APL, an acronym for "A PROGRAMMING
LANGUAGE," was designed to be used alone or in conjunction
with the IBM GRAFSTAT statistical graphics package.
GRAFSTAT, an experimental program currently under development
by the IBM Watson Research Center, is available at the
Naval Postgraduate School for test and evaluation purposes
[Ref. 5]. All graphs contained in this paper were produced
by the GENERAL PLOT function of the GRAFSTAT program.

LOWS, a modification of LOWESS, when used in conjunction
with GRAFSTAT and expanded versions of the DRAFTSMAN DISPLAY
programs described in [Ref. 6], enhances an already powerful
exploratory data analysis package.

A FORTRAN version of the basic LOWESS program was 
designed to be used in conjunction with either DISPLA 
[Ref. 7], or any other W.R. Church computer system supported 
graphing package. 

These programs are interactive and can be used easily by
individuals who have little or no APL or FORTRAN programming
skills. Users who are well versed in these languages should
be able to modify them to provide tailor made outputs,
expand their capabilities or incorporate them into other
analysis packages.

Detailed user instructions are contained in Chapters IV 
and V while examples of their use are presented in Chapter 
III. Users who are interested in the mathematical details 
of Robust Locally Weighted Regression and Scatterplot 
Smoothing should read Chapter II. 
II. TECHNICAL DESCRIPTION OF LOWESS



A. OVERVIEW 

Locally Weighted Regression Scatterplot Smoothing
(LOWESS), is a generalized extension of locally fitted polynomial
smoothing techniques used by many statisticians in
time series analysis (see footnote 1). Unlike its predecessors,
however, LOWESS was designed to work on unequally as well as
equally spaced X's. It also contains a robust fitting procedure
that guards against possible distortion of the smoothed
curve by outlier points. The general procedure used by
Cleveland is an adaptation of iterated least squares regression
techniques developed by Albert Beaton and John Tukey
[Ref. 8].

The overall objective of LOWESS, like most smoothing or
regression routines, is to compute a "fitted" value, Ŷ, that
depicts the middle of the empirical distribution of Y at
each X. Unfortunately, most data sets do not contain enough
repeated observations at each X to provide a good estimate
of the middle of this distribution. LOWESS derives its estimate
of Ŷ from the equation of a weighted least squares
regression line fitted to a set of data points whose X
values are located in a user defined neighborhood about Xi
(the X value of the point being smoothed).

1 A brief theoretical explanation of these techniques was
presented in Chapter I.

B. MATHEMATICAL DETAILS: NON-ROBUST LOWESS SMOOTHING

The first step in generating a LOWESS smoothed point
consists of forming a neighborhood, Figure 2.1, centered
around Xi and comprised of its Q nearest neighbors. The user
determines Q by choosing the parameter F, which is approximately
equal to the fraction of the data points
used in computing each fitted value. Q is (F × N) rounded to
the nearest integer, and the Q nearest neighbors are those
points whose X values are closest to Xi. Note that there
are not necessarily an equal number of neighborhood points
on either side of Xi. Also, Xi is considered to be a
neighbor of itself. The parameters F and Q, determined
prior to smoothing the first data point, are held constant
and used throughout the procedure.
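The neighborhood selection just described can be sketched in Python as follows. This is an illustration, not the thesis's APL or FORTRAN code. The function name `neighborhood`, the evenly spaced 20 point X vector and the tie-breaking rule (lower index first) are assumptions, and indices are 0-based, so X6 is `x[5]`.

```python
import numpy as np

def neighborhood(x, i, f):
    """Return (q, idx): Q = F*N rounded to the nearest integer, and the
    sorted indices of the Q points whose X values are closest to x[i].
    x[i] counts as a neighbor of itself."""
    x = np.asarray(x, dtype=float)
    n = x.size
    q = int(np.floor(f * n + 0.5))        # round half up
    q = max(1, min(q, n))                 # keep Q within 1..N
    dist = np.abs(x - x[i])
    # A stable sort breaks distance ties in favor of the lower index
    # (an assumption; the thesis does not state a tie rule).
    idx = np.argsort(dist, kind="stable")[:q]
    return q, np.sort(idx)

x = np.arange(1.0, 21.0)                  # hypothetical evenly spaced X1..X20
q, idx_x6 = neighborhood(x, 5, 0.5)       # neighbors of X6
_, idx_x20 = neighborhood(x, 19, 0.5)     # neighbors of X20 (right end point)
print(q, idx_x6 + 1, idx_x20 + 1)         # +1 converts to 1-based labels
```

For an interior point the neighbors fall on both sides of Xi, while for an end point they all lie on one side, so the two sides are not necessarily balanced.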




Figure 2.1 Vertical Strip Containing the 10 Nearest
Neighbors of X6 in Data Set Two.

In Figure 2.1, the point to be smoothed, X6, is highlighted
by a dotted line and the strip boundaries are delineated
by solid lines passing through X1 and X10.

STEP TWO consists of defining the local weighting function
and calculating individual weights for each point,
(Xk,Yk), in the strip formed during STEP ONE. This weighting
function is to be centered at Xi and scaled so that it hits
zero for the first time at the Qth nearest neighbor of Xi
(the strip boundary furthest from Xi). Functions having the
following properties will satisfy these requirements:

1. W(U) > 0 for |U| < 1 (positivity),

2. W(-U) = W(U) (symmetry),

3. W(U) is a nonincreasing function for U ≥ 0,

4. W(U) = 0 for |U| ≥ 1.



Cleveland, [Ref. 3], suggests using a tricube weight function
of the form:

\[ W(U) = \begin{cases} (1 - |U|^3)^3 & \text{for } |U| < 1 \\ 0 & \text{otherwise} \end{cases} \]

Note that this function uses the absolute value of U. The
weight given to any point within the strip is calculated by:

\[ W(U) = W\!\left( \frac{X_i - X_k}{D_i} \right) \]

The variable Di is the distance along the X axis from Xi to
its Qth nearest neighbor. This is the distance from X6 to
the left hand boundary in Figure 2.1. When LOWESS starts
its smoothing pass at X1, the right hand boundary passes
through its Qth nearest neighbor, X10 in this example. The
neighborhood which, at that time, contains the points X1 ...
XQ remains fixed until the distance (Xi-X1) is greater than
(XQ-Xi). This usually occurs at i = Q/2 for evenly spaced
data. At this point the neighborhood is advanced and the Qth
nearest neighbor shifts to the left hand boundary where it
remains until all of the data points have been smoothed. Di,
therefore, is generally the distance from Xi to the right
hand boundary for i = 1...(Q/2) and is the distance from Xi
to the left hand boundary for i = (Q/2)...N.
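The weight calculation can be sketched in Python as follows. The names `tricube` and `local_weights` and the evenly spaced X vector are assumptions made for illustration; the handling of Di = 0 follows the rule stated in the text that points sharing the abscissa Xi receive weight 1.

```python
import numpy as np

def tricube(u):
    """Tricube weight: (1 - |u|^3)^3 for |u| < 1, and 0 otherwise."""
    u = np.abs(np.asarray(u, dtype=float))
    return np.where(u < 1.0, (1.0 - u**3) ** 3, 0.0)

def local_weights(x, i, q):
    """Weights W((x[i] - x[k]) / D_i) for every point k, where D_i is the
    distance from x[i] to its q-th nearest neighbor (x[i] included)."""
    x = np.asarray(x, dtype=float)
    dist = np.abs(x - x[i])
    d_i = np.sort(dist)[q - 1]
    if d_i == 0.0:
        # All q neighbors share the abscissa x[i]: give them weight 1.
        return (dist == 0.0).astype(float)
    return tricube(dist / d_i)

x = np.arange(1.0, 21.0)          # hypothetical evenly spaced X1..X20
w = local_weights(x, 5, 10)       # weights for smoothing X6 (0-based index 5)
print(np.round(w, 4))
```

In this example D6 = 5, so the weight is 1 at X6, falls off symmetrically on either side, and reaches zero at the Qth nearest neighbor and at every point beyond it.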

The weight given to any point in the strip is equal to
the height of the curve, W(U), at Xk, Figure 2.2. This
figure demonstrates that the tricube weight function:

1. gives the largest weight to the point being smoothed;

2. decreases smoothly as Xk moves away from Xi;

3. is symmetric about the point being smoothed;

4. hits zero for the first time at the Qth nearest
neighbor of Xi.




Figure 2.2 TRICUBE Weight Function for the 10 Nearest
Neighbors of X6 in Data Set Two.

In cases where several points have abscissas equal to
Xi, all of them are given weight 1. If Di is zero, meaning
that all Q points in the strip have abscissas equal to Xi,
it is impossible to estimate the slope of a fitted line. In
this instance, a constant equal to the mean Y value for all
Q points is fitted to the point (Xi,Yi).

STEP THREE uses weighted least squares regression to fit
a polynomial of degree P to the data points that lie within
the strip containing Xi. The parameters of the equation
that describes this line are the values of Bj, j = 0,1,...,P,
that minimize:

\[ \sum_{k=1}^{Q} W_k(U) \left( Y_k - B_0 - B_1 X_k - \cdots - B_P X_k^P \right)^2 \]

Figure 2.3 shows straight (P = 1) and quadratic (P = 2) lines
fit to the neighborhood points surrounding X6 in data set
two.

[Figure 2.3 panels: LINEAR; QUADRATIC]

Figure 2.3 Linear and Quadratic Fits.



The choice of an appropriate P depends on the user's
perception of the relationship between the points within
each neighborhood, the need for flexibility to reproduce
patterns in the data, and computational ease. The existence
of physical theories that define the relationships as being
nonlinear might also influence this choice. Smoothed curves
based on higher order polynomial regressions tend to follow
abrupt pattern changes better than those based on linear
models. Cleveland [Ref. 3], feels that computational
considerations begin to override the need for flexibility
for values of P greater than 1.

The smoothing routine written for this thesis is capable
of performing linear or quadratic regressions. Using P = 1
or 2 should provide adequately smoothed points for any data
set.



The final step in the Locally Weighted Regression
portion of the LOWESS procedure is the determination of the
smoothed point (Xi,Ŷi), Figure 2.4, where:

\[ \hat{Y}_i = \sum_{j=0}^{P} B_j(X_i) \, X_i^{\,j} \]

The notation used here emphasizes that the coefficients of
the Xi are different for each point Xi.
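Steps one through three and the final evaluation can be combined into a short Python sketch of one non-robust smoothing pass. This is an illustration under stated assumptions, not the thesis's APL or FORTRAN implementation: the function name `lowess_nonrobust` is hypothetical, Q is forced to be at least P + 1 so the local polynomial is determined, and the weighted least squares problem is solved by scaling each row by the square root of its weight.

```python
import numpy as np

def lowess_nonrobust(x, y, f=0.5, degree=1):
    """One locally weighted regression pass (no robustness iterations)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    # STEP ONE: neighborhood size Q = F*N rounded, at least degree + 1.
    q = min(n, max(degree + 1, int(np.floor(f * n + 0.5))))
    design = np.vander(x, degree + 1, increasing=True)   # columns 1, x, x^2
    fitted = np.empty(n)
    for i in range(n):
        dist = np.abs(x - x[i])
        d_i = np.sort(dist)[q - 1]        # distance to Q-th nearest neighbor
        if d_i == 0.0:
            # Degenerate strip: fit the mean Y of the coincident points.
            fitted[i] = y[dist == 0.0].mean()
            continue
        # STEP TWO: tricube weights, zero outside the strip.
        u = dist / d_i
        w = np.where(u < 1.0, (1.0 - u**3) ** 3, 0.0)
        # STEP THREE: weighted least squares via sqrt-weight row scaling.
        root_w = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(design * root_w[:, None],
                                   y * root_w, rcond=None)
        # Final step: evaluate the local polynomial at x[i].
        fitted[i] = design[i] @ coef
    return fitted

x = np.linspace(0.0, 10.0, 20)
line = 2.0 + 3.0 * x
print(np.max(np.abs(lowess_nonrobust(x, line) - line)))  # near zero
```

Because every local fit is a polynomial of degree P, noise-free data that lie exactly on a polynomial of that degree are returned unchanged, which gives a convenient check on the sketch. Every point, including the end points, receives a fitted value.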




Figure 2.4 Scatter Plot of Data Set Two Superimposed 
With Smoothed Point (X6,Y6). 

LOWESS differs from most other smoothing routines
because it smooths all of the data points. This becomes
important when smoothing small data sets, when important
pattern changes take place near the ends of the data set, or
when the smoothed curve is to be used as a regression line
to predict future trends. Figure 2.5 summarizes the sequence
of steps described above, as they are used to compute a
"fitted" value for (X20,Y20), the right hand end point in
data set two.

A comparison of Figures 2.1 and 2.5 reveals that the
widths of the vertical strips about (X6,Y6) and (X20,Y20)
are not equal. Note that the ten nearest neighbors of X20
are all to the left. Although both strips contain ten data
points, the requirement to center them around their
respective (Xi,Yi) points forces the right hand portion of
the weighting function in Figure 2.5 to fall off-scale. The
left hand portion of the weighting function for (X1,Y1) is
forced off scale for the same reason. These partial
weighting functions still fulfill all of the requirements
outlined earlier, however. Unequal spacing of the X's also
creates variable strip widths.

[Figure 2.5 panels: STEP 1; STEP 3]

Figure 2.5 Summary of Steps Required for Computing the
Smoothed Value at (X20,Y20) in Data Set Two.

A set of smoothed data points, Figure 2.6, is obtained
by completing the aforementioned steps for each point in the
original data set.

Figure 2.6 Plots of LOWESS Smoothed Data Points and
Smoothed Curve Superimposed on Data Set Two, (F=.5).

C. MATHEMATICAL DETAILS: ROBUST LOWESS SMOOTHING 

The robust smoothing feature of LOWESS prevents a small
number of outliers from distorting the smoothed curve. The
point (X10,Y10) in Figure 2.1 is one such outlier.

The robust procedure computes a new set of weights for
each (Xi,Yi) based on the size of the residuals, (Yi-Ŷi),
obtained after the first smoothing pass, Figure 2.7.

Cleveland [Ref. 3], suggests using a bisquare function
of the form:

\[ D(V) = \begin{cases} (1 - V^2)^2 & \text{for } |V| < 1 \\ 0 & \text{otherwise} \end{cases} \]

Robustness weights for each point are calculated by:

\[ d_k(V) = D\!\left( \frac{R_k}{6M} \right) \]

where Rk is the residual at (Xk,Yk) and M is the median of
the absolute values of the residuals, Figure 2.8. This is
sometimes referred to as the Median Absolute Deviation (MAD).

Figure 2.7 Residuals (Yi-Ŷi) Versus Xi for the
Non-Robust Smoothed Points of Data Set Two.

Figure 2.8 Robust Weighting Function For the First
Pass Through Data Set Two.



This scheme gives small weights to points associated
with large residuals and large weights to points with small
residuals. One iteration of the robust locally weighted
regression procedure is completed by calculating a new set
of "fitted" values using the weighting function

\[ \mathrm{WT} = W(U) \times D(V) \]

in step three.
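The robustness weights can be sketched in Python as follows. The names `bisquare` and `robustness_weights` and the sample residuals are hypothetical. The treatment of the case M = 0 (weight 1 for zero residuals, 0 otherwise) is an assumption, since the text does not address it.

```python
import numpy as np

def bisquare(v):
    """Bisquare weight: (1 - v^2)^2 for |v| < 1, and 0 otherwise."""
    v = np.abs(np.asarray(v, dtype=float))
    return np.where(v < 1.0, (1.0 - v**2) ** 2, 0.0)

def robustness_weights(residuals):
    """Weights D(R_k / 6M), with M the median absolute residual."""
    r = np.asarray(residuals, dtype=float)
    m = np.median(np.abs(r))
    if m == 0.0:
        # Assumption: with a zero median, keep only exact fits.
        return np.where(r == 0.0, 1.0, 0.0)
    return bisquare(r / (6.0 * m))

# Hypothetical residuals from a first smoothing pass; the last is an outlier.
resid = np.array([0.10, -0.20, 0.05, 0.15, -0.10, 5.00])
d = robustness_weights(resid)
print(np.round(d, 4))
```

In a robust pass each tricube weight W(U) would be multiplied by the corresponding value of `d` before the weighted least squares fit is repeated, so a point whose residual exceeds 6M drops out of every local regression.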






Execution of the entire LOWESS algorithm, consisting of
one locally weighted regression pass and two robust locally
weighted regression passes, produces a robust smoothed curve,
Figure 2.9. The effect of the "outlier" can be seen very
clearly.



[Figure 2.9 panels: NON-ROBUST LOWESS; ROBUST LOWESS]

Figure 2.9 Comparison of Non-Robust and Robust LOWESS
Smoothing of Data Set Two, (F=.5).



Cleveland [Ref. 3], reports that the number of computations
required to complete the LOWESS algorithm on an entire
data set is on the order of FN². For example, 60 linear
regressions were used to complete the robust smoothing of
the 20 artificial data points in Figure 2.9. The non-robust
curve, on the other hand, required 2/3 fewer calculations
and took less than 1/2 the time. The number of calculations
required to produce a smoothed curve presents no significant
problem for plots of fewer than 100 points. Computational
time can be saved by grouping the Xi's on data sets that
have repeated X values. This saving results from the fact
that if Xi+1 = Xi then Ŷi+1 = Ŷi. Assigning the same Ŷi
value to each of the Ni repeated Xi's reduces the number of
regressions required by Ni for non-robust smoothing and by
3Ni for robust smoothing.
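The grouping idea can be sketched in Python as follows. The helper `broadcast_by_x` and the stand-in fitting function are hypothetical; in practice `fit_at` would be the local regression evaluated at a given X value.

```python
import numpy as np

def broadcast_by_x(x, fit_at):
    """Evaluate fit_at once per distinct X value and copy the result to
    every repeated X. Returns the fitted vector and the number of fits."""
    x = np.asarray(x, dtype=float)
    unique_x, inverse = np.unique(x, return_inverse=True)
    fitted_unique = np.array([fit_at(v) for v in unique_x])
    return fitted_unique[inverse], unique_x.size

calls = []

def stand_in_fit(v):
    """Placeholder for a local regression; records each evaluation."""
    calls.append(v)
    return 2.0 * v

fitted, n_fits = broadcast_by_x([1, 1, 2, 3, 3, 3], stand_in_fit)
print(fitted, n_fits)
```

Here six data points share three distinct X values, so the fitting function is evaluated three times per pass rather than six.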
D. CHOOSING F

There are no set criteria for choosing F. Small values
produce curves with high resolution and a lot of noise.
Larger F's produce curves with low resolution and less
noise, but require increased computational time. In
general, increasing F tends to produce smoother curves,
Figure 2.10. Cleveland, [Ref. 3], suggests that values
between .2 and .8 should be satisfactory for most purposes.
The goal is to choose the largest F that minimizes the variability
in the smoothed points without distorting patterns
in the data. Computational time may become a consideration
in choosing F when smoothing large data sets. In general
though, F will decrease as the series length increases.



[Figure 2.10 panels: ROBUST LOWESS F = .2; ROBUST LOWESS F = .3;
ROBUST LOWESS F = .5; ROBUST LOWESS F = .7]

Figure 2.10 Comparison of Robust LOWESS Smoothing of
Data Set Two for Different Values of F.



Smoothing routines, LOWESS included, do not provide 
regression equations or other analytical results on which to 
test goodness of fit. The user must judge the adequacy of 
the results. The choice of F is not so critical for cases in 
which the purpose of the smoothing is to enhance the visual 
perception of gross patterns in the data. For example, the 
rough curve obtained by using F=.2 on data set two, the left 
hand plot of Figure 2.10, provides an adequate picture of an 
overall increasing trend. More care must be taken in some 
applications, such as time series analysis, or when the 
smoothed (Xi,Yi) values may be used as a type of regression 
function, or finally, when the smoothed curve may be 
presented without an accompanying plot of the original data 
points. Taking F=.5 is a reasonable choice when there is no
clear idea of what is needed, [Ref. 3]. Chambers, [Ref. 1],
suggests that it is often wise to try several values of F
before selecting the "best" one for a particular
application.

Cross-validation techniques for determining bandwidth
have been considered by Cleveland [Ref. 3],
and Rice [Ref. 9], but are not included here.
III. EVALUATION OF THE LOWESS CURVE SMOOTHING PROGRAM



A. GENERAL 

Smoothing routines are generally used to filter noisy
data and approximate underlying relationships that may be
too complex to describe mathematically or too difficult to
fit by simple polynomial regression. Effective routines
must be flexible and local. They must allow the data to
determine the shape of the smoothed curve and they must be
able to follow abrupt as well as smooth changes in curvature.
This evaluation will test LOWESS in each of these
areas.



B. METHODOLOGY

LOWESS, like most other curve smoothing schemes,
provides no analytical solutions by which to measure its
effectiveness. The correctness or adequacy of the fit must
be judged subjectively, and there are no standard guidelines
to follow. Sometimes the shape of the fit can be checked by
comparing it to the physical laws that govern the application
at hand. The programs written to support this thesis
were evaluated by:

1. examining their performance on a set of test data for
which the underlying functional relationships were
known;

2. comparing their results with those obtained from
widely used and previously validated curve smoothing
techniques, namely: LEAST SQUARES REGRESSION, MOVING
AVERAGE and COSINE ARCH weighted smoothing.

The theory of moving average procedures dates back to
definitive studies of discrete time series models completed
by H. Wold in the mid 1930's. The general process is based
on the assumptions and theories recounted in Chapter I. The
moving average is defined by the expression

\[ X(T) = \sum_{j=-M}^{N} A_j \, Z(T-j), \qquad T = 0, 1, \ldots \]

where M and N are nonnegative integers and the weighting
coefficients Aj are real constants. Kendall and Stuart
[Ref. 4], and Koopmans [Ref. 10], present in depth discussions
and theoretical derivations that expand on the ideas
presented in Chapter I. The moving average routine employed
in this analysis is contained in the IBM GRAFSTAT statistical
graphics package. The weighting function used in that
program takes the form

\[ A_j = \frac{1}{M}, \qquad j = -M, \ldots, N \]

The COSINE ARCH smoothing procedure used here is a
moving average process that uses a cosine weighting function
of the form

\[ A_j = \frac{1}{M+1} \left[ \ldots - \cos\!\left( \frac{\ldots}{M+1} \right) \right], \qquad j = 0, 1, \ldots, N-1 \]

It is characterized as a good smoother by Anscombe,
[Ref. 11], and is often used as a trend remover during time
series analysis.

C. TESTING PROCEDURES AND RESULTS 

Three sets of test data were developed to check all
aspects of the LOWESS program's capabilities: its ability to
follow linear trends as well as abrupt and smooth changes in
curvature.

1. Phase One: Linear Trends

Test set one, Figure 3.1, consisting of 150 data
points having the following functional relationship:

\[ Y = X + \text{NORMAL}(0,1)\ \text{NOISE}, \qquad 0 < X \le 10 \]

was designed to test LOWESS' ability to detect linear trends
in noisy data. Although this test appears redundant, many
complex smoothing procedures have failed because they did
not return straight lines when that was the shape of the
underlying curve.
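A data set of this kind can be generated and checked against a straight line fit with a few lines of Python. This sketch does not reproduce the thesis's actual test data: the even spacing of X on (0, 10] and the random seed are assumptions, so the fitted coefficients will differ from those reported for test set one.

```python
import numpy as np

rng = np.random.default_rng(1)                 # arbitrary seed (assumption)
n = 150
x = np.linspace(10.0 / n, 10.0, n)             # evenly spaced on (0, 10] (assumption)
y = x + rng.normal(0.0, 1.0, n)                # Y = X + Normal(0,1) noise

# Linear least squares regression of Y on X for comparison.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)

# ddof=2 accounts for the two estimated regression coefficients.
print(slope, intercept, residuals.std(ddof=2))
```

If the smoother under test recovers the underlying line, its residuals should behave like the Normal(0,1) noise that was added, which is the comparison made in the remainder of this phase.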




Figure 3.1 Test Set One With and Without N(0,1) Noise.

The adequacy of LOWESS' performance on test set one
was measured by comparing it with a linear least squares
regression line fitted to the same data.

As pointed out in CHAPTER II, LOWESS produces
increasingly smoother curves as the parameter F approaches
1. When F=1, each neighborhood used throughout the smoothing
process contains N × 1 = N points. This implies that each
smoothed point (Xi,Ŷi) is computed from the equation of the
TRICUBE weighted regression line fitted to all of the data.
This procedure should produce a LOWESS smoothed curve that
closely resembles the linear regression of Y on X. The
TRICUBE weighting function used in LOWESS may cause minor
disparities between the two "fits," however. A visual
inspection of the bottom two plots in Figure 3.2 reveals
that LOWESS and the linear regression produced nearly
identical "fits."

[Figure 3.2 panels: LOWESS F = .2; LOWESS F = .5; LOWESS F = 1;
LINEAR REGRESSION, Y = -0.16524 + 1.0143 × X]

Figure 3.2 Comparison of LOWESS Smoothing and Linear
Regression of Test Set One.

Goodness of fit can be measured by examining the
residuals (Yi-Ŷi) from each smoothing procedure. A perfect
reproduction of the underlying functional relationship, Y =
X, would produce a set of residuals distributed Normal(0,1),
the same distribution found in the noise. The results of the
GRAFSTAT distribution fitting procedure summarized in Table
II indicate that the distribution of the regression residuals
can be approximated as Normal(0,1.04) while the LOWESS
residuals are approximately Normal(.002,1.016).

Hypothesis tests comparing the means and variances
of these distributions with those of the Normal(0,1)
distributed noise will provide some measure of the goodness
of fit of each smoothing scheme. The results of these
tests, conducted at the 95% confidence level, are summarized
in Table I.

The output of the GRAFSTAT distribution fitting 
procedure presented in Table II and the hypothesis tests 
summarized in Table I, suggest that there is no significant 
difference between the distribution of the residuals from 
the linear regression or LOWESS smoothing of test set one, 
and the Normal (0,1) noise incorporated into the data. This 
provides strong support for the premise that LOWESS depicts 
linear trends very well. Visual comparison of the LOWESS 
smooths in Figure 3.2 confirms that LOWESS follows the same 
general trend regardless of what F is used; small values 
provide rougher curves that have the same general slope. 
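The form of the test statistics is not stated in the text. As a rough check, the Python sketch below assumes large-sample normal approximations for both tests: a z statistic for the mean, and a z statistic for the variance based on Var(s²) ≈ 2σ⁴/n under normality. The function names are hypothetical, and the sample size of 150 is taken from the description of test set one. Under these assumptions the statistics come out close to the T values listed in Table I, but this should not be read as the exact procedure used.

```python
import math

def z_mean(sample_mean, sample_var, n, mu0=0.0):
    """Large-sample z statistic for H0: mean = mu0 (assumed form)."""
    return (sample_mean - mu0) / math.sqrt(sample_var / n)

def z_variance(sample_var, n, var0=1.0):
    """Large-sample z statistic for H0: variance = var0, using the
    normal-theory approximation Var(s^2) ~ 2*var0^2/n (assumed form)."""
    return (sample_var - var0) / (var0 * math.sqrt(2.0 / n))

n = 150
critical = 1.96                               # two-sided 5% critical value

stats = {
    "regression mean": z_mean(0.000, 1.040, n),
    "regression var":  z_variance(1.040, n),
    "LOWESS mean":     z_mean(0.002, 1.016, n),
    "LOWESS var":      z_variance(1.016, n),
}
for name, z in stats.items():
    decision = "accept" if abs(z) < critical else "reject"
    print(f"{name:16s} z = {z:.3f}  {decision}")
```

All four statistics fall well inside the critical value of 1.96, in agreement with the conclusion that neither set of residuals differs significantly from the Normal(0,1) noise.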











TABLE I

Comparison of the Means and Variances of Residuals From
Smooths of Test Set One to the Normal (0,1) Noise

                             noise    T        Z(1-α/2)              P
linear    mean    0.000      0        0.000    1.96       accept     0.05
rgrsn     var     1.040      1        0.346    1.96       accept     0.07
LOWESS    mean    0.002      0        0.024    1.96       accept     0.05
          var     1.016      1        0.138    1.96       accept     0.06
TABLE II

Summary of GRAFSTAT Distribution Fitting of
Residuals from Regression and LOWESS Smooths of Test Set One

RESIDUALS FROM LINEAR REGRESSION

NORMAL DISTRIBUTION

X            : RESD
SELECTION    : ALL
LABEL        : RESD
SAMPLE SIZE  : 150
MINIMUM      : -2.846
MAXIMUM      : 3.151
CENSORING    : NONE
EST. METHOD  : MAXIMUM LIKELIHOOD

              SAMPLE        FITTED
MEAN       :  2.0898E-14    2.0898E-14
STD DEV    :  1.0295E0      1.0295E0
SKEWNESS   :  1.1908E-1     0.0000E0
KURTOSIS   :  3.1359E0      3.0000E0

COVARIANCE MATRIX OF PARAMETER ESTIMATES
              MU            SIGMA
MU            0.0070189     0
SIGMA         0             0.003533

PERCENTILES   SAMPLE        FITTED
 5            -1.7375       -1.6938E0
10            -1.3381       -1.3196E0
25            -0.59152      -6.9409E-1
50            -0.032298      1.0399E-7
75             0.63234       6.9409E-1
90             1.3208        1.3196E0
95             1.7182        1.6938E0

GOODNESS OF FIT
CHI-SQUARE  : 2.3078
DEG FREED   : 5
SIGNIF      : 0.80513
KOLM-SMIRN  : 0.040266
SIGNIF      : 0.96816
CRAMER-V M  : 0.027624
SIGNIF      : > .15
ANDER-DARL  : 0.17006
SIGNIF      : > .15
KS, AD, AND CV SIGNIF. LEVELS NOT EXACT WITH ESTIMATED PARAMETERS

0.95 CONFIDENCE INTERVALS
PARAMETER     ESTIMATE      LOWER       UPPER
MU            2.0898E-14    -0.16424    0.16424
SIGMA         1.0295E0       0.92471    1.1613

RESIDUALS FROM LOWESS SMOOTHING

NORMAL DISTRIBUTION

X            : LOWESS RESIDUALS
SELECTION    : ALL
LABEL        : LORES
SAMPLE SIZE  : 150
MINIMUM      : -2.909
MAXIMUM      : 3.090
CENSORING    : NONE
EST. METHOD  : MAXIMUM LIKELIHOOD

              SAMPLE        FITTED
MEAN       :  0.016268      0.016268
STD DEV    :  1.0237        1.0237
SKEWNESS   :  0.093313      0
KURTOSIS   :  3.1452        3

COVARIANCE MATRIX OF PARAMETER ESTIMATES
              MU            SIGMA
MU            0.0069398     0
SIGMA         0             0.0034932

PERCENTILES   SAMPLE        FITTED
 5            -1.6646       -1.6679
10            -1.3315       -1.2958
25            -0.55317      -0.6739
50             0.010179      0.016268
75             0.64998       0.70643
90             1.2874        1.3284
95             1.7125        1.7005

GOODNESS OF FIT
CHI-SQUARE  : 1.4385
DEG FREED   : 5
SIGNIF      : 0.920[?]6
KOLM-SMIRN  : 0.047238
SIGNIF      : 0.89136
CRAMER-V M  : 0.030631
SIGNIF      : > .15
ANDER-DARL  : 0.18198
SIGNIF      : > .15
KS, AD, AND CV SIGNIF. LEVELS NOT EXACT WITH ESTIMATED PARAMETERS

0.95 CONFIDENCE INTERVALS
PARAMETER     ESTIMATE      LOWER       UPPER
MU            0.016268      -0.14704    0.17958
SIGMA         1.0237         0.91948    1.1548
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2 . 



lliiise Two: Abrupt C h an ges in Curvature 

Test set two, Figure 3.3, consisting of 220 data
points having the following mathematical relationship

\[
Y =
\begin{cases}
.4X + \text{NORMAL}(0,1)\ \text{NOISE}, & 0 < X \le 10 \\
3 + .1X + \text{NORMAL}(0,1)\ \text{NOISE}, & 10 < X \le 25 \\
14.6 - .367X + \text{NORMAL}(0,1)\ \text{NOISE}, & 25 < X \le 40 \\
0 + \text{NORMAL}(0,1)\ \text{NOISE}, & 40 < X \le 44
\end{cases}
\]

was used to test LOWESS' ability to handle abrupt pattern
changes. The smooth of test set two generated by LOWESS was
compared to those produced by MOVING AVERAGE and COSINE ARCH
filtering of the same data.




Figure 3.3 Test Set Two With and Without N(0,1) Noise. 

Determining the amount of smoothing required by a
data set is, perhaps, the most difficult aspect of using any
curve smoothing routine. Smoothness is controlled by the
size of the parameter F in LOWESS and by the parameter M
(bandwidth) in MOVING AVERAGE and COSINE ARCH smoothing.
These parameters determine the number of points, or
neighborhood size, used to compute each smoothed value. The goal,
regardless of the method chosen, is to use the largest
neighborhood that minimizes the variability in the smoothed
points without distorting patterns in the data. Another
factor that must be considered when choosing M is that
MOVING AVERAGE and COSINE ARCH smoothing routines produce
only (N - M) smoothed points. Using proportionately large
values of M, therefore, might result in losing significant
portions of the original pattern at the ends. This
shortcoming will be evident in the graphical comparisons made
throughout the remainder of this chapter.

Comparison tests made during phases two and three of 
this evaluation used selected LOWESS smooths and corre- 
sponding MOVING AVERAGE and COSINE ARCH smoothed curves. 
Parameters for the three processes are directly convertible 
by the relationship M = F*N. 
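
As a rough illustration of this conversion and of the loss of end points, the sketch below (Python, not the thesis code) computes M from F and N and applies an equally weighted moving average over M + 1 consecutive points. The equal weights and the M + 1 window are assumptions chosen so that exactly N - M smoothed values result; the thesis routines may define the window and weights differently.

```python
import numpy as np

def bandwidth_from_fraction(f, n):
    """Convert the LOWESS fraction F into a bandwidth M = F*N."""
    return int(round(f * n))

def moving_average(y, m):
    """Equally weighted moving average over m + 1 points.

    Only windows that lie entirely inside the data are used, so the
    result has len(y) - m values and the ends are not smoothed.
    """
    y = np.asarray(y, dtype=float)
    window = np.ones(m + 1) / (m + 1)
    return np.convolve(y, window, mode="valid")
```

For a 220 point data set, F = .2 corresponds to M = 44, and the moving average then returns 176 smoothed values.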

Figure 3.4 presents graphical comparisons of LOWESS
smooths (solid line) using parameter values F = .15, .25, .50
and .75 to illustrate some of the considerations made during
the parameter selection phase of a smoothing operation. The
exact underlying relationships (dashed lines) were included
to demonstrate how large values of F can cause pattern
distortion.

It is apparent from the sequence of illustrations in
Figure 3.4 that LOWESS produces smoother curves as F
increases. The smoothest curves are not always the most
desirable, however. The bottom two curves (F=.50 and F=.75)
have distorted the original pattern by using too many points
to compute the smoothed values. Test set two contains 50
points in the segment (0 < X < 10). Using a neighborhood much
larger than 220*.25 = 55 points on this data set would have
a tendency to fit the wrong slope to the first linear
segment. Additionally, it would cause oversmoothing of the
corners. Figure 3.5 shows the neighborhood and linear
regression used to smooth the point (X10,Y10) during
production of the smoothed curve (F=.75) pictured in the lower
right corner of Figure 3.4. It is easy to see that following
this slope would distort the pattern presented by the data.

[Figure 3.4 panels: LOWESS F = .15, .25, .50 and .75]

Figure 3.4 Comparison of LOWESS Smoothing of Test Set Two
Using Different Values of the Parameter F.




Figure 3.5 Linear Regression Step in Smoothing (X10,Y10)
in Test Set Two Using LOWESS with F=.75.
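
The regression step pictured in Figure 3.5 can be sketched in a few lines of Python. This is a minimal, non-robust illustration rather than the thesis implementation: it assumes the neighborhood is the round(F*N) points nearest the target X, and that the points are weighted by the tricube function of their scaled distance, as in Cleveland's published description of LOWESS. The function name is hypothetical.

```python
import numpy as np

def lowess_point(x, y, x0, f):
    """Smoothed value at x0 from one weighted local linear regression."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    q = max(2, int(round(f * len(x))))       # neighborhood size
    dist = np.abs(x - x0)
    idx = np.argsort(dist)[:q]               # q nearest neighbors of x0
    h = dist[idx].max()                      # distance to farthest neighbor
    w = (1.0 - (dist[idx] / h) ** 3) ** 3    # tricube weights (assumed)
    # Weighted least squares: scale rows by sqrt(w), then ordinary lstsq.
    sw = np.sqrt(w)
    design = np.column_stack([np.ones(q), x[idx]])
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y[idx] * sw, rcond=None)
    return coef[0] + coef[1] * x0, coef
```

With F = .75 on test set two, the neighborhood of a point in the first segment reaches well into the second and third segments, so the fitted slope `coef[1]` is a compromise between them rather than the local slope.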



The F=.15 plot depicted in Figure 3.4 demonstrates
that small F's create very locally smoothed curves that
contain a great deal of noise but follow gross patterns very
well. Using a small F is an excellent idea if the sole
purpose of the smoothing is to highlight major trends in the
data.

The LOWESS smoothed curve obtained by using F=.25 is
the one best suited for comparison with corresponding MOVING
AVERAGE and COSINE ARCH smooths, Figure 3.6.

[Figure 3.6 panels: TEST SET TWO; LOWESS F = .2;
MOVING AVERAGE M = 44; COSINE ARCH M = 44]

Figure 3.6 Comparison of LOWESS, MOVING AVERAGE
and COSINE ARCH Smoothing of Test Set Two.



Inspection of the plots in Figure 3.6 reveals that
all of the smoothing procedures fit similarly shaped curves
to most of the data. The inability of the MOVING AVERAGE and
COSINE ARCH routines to smooth the extreme edges of a plot
precluded them from fitting a curve to the last segment of
test set two. Practitioners of these routines often extend
the curve or fit the ends by eye. Applying these techniques
to the bottom curves in Figure 3.6 does not reveal any
significant pattern changes. LOWESS, although it does not
follow the level trend accurately, does reveal a major
pattern change in the last section of the data.

All three of the procedures have a tendency to round
sharp corners as the parameters F and M are increased. The
MOVING AVERAGE curve, in the lower left, has a very rounded
shape and does not highlight the linear trend in segments
one or two. The COSINE ARCH filter does a little better. It
portrays the linearity of section three with nearly the
correct slope but fits segments one and two with one smooth
curve. Additionally, it has added a misleading hump at the
intersection of segments two and three. LOWESS is the only
procedure that clearly pictures the underlying pattern as a
series of straight lines. An experienced user who
understands that LOWESS rounds corners could almost duplicate
the original pattern by connecting the linear portions of
the curve.

Smoothing procedures are not only judged on their 
ability to depict patterns, but are also rated on their 
ability to filter out unwanted noise. Gross differences in 
their capabilities can be picked out easily in a graphical 
comparison. It is readily apparent that the MOVING AVERAGE
curve in Figure 3.6 is much noisier than either the LOWESS
or COSINE ARCH smooths.

A more analytical measure of a procedure's smoothing
ability can be made by comparing periodograms of the
unfiltered and filtered data. A periodogram is an analysis
technique used to estimate the spectral density function of a
time series at periodic frequencies, $\lambda_v$. The periodogram
function is defined by

Refer to Koopmans [Ref. 10], chapter 8, for a detailed
discussion of the periodogram and its distributional
properties. The periodograms in Figure 3.7 provide
comparisons of the filtering properties of each smoothing
routine. The vertical lines on each plot represent
periodicities, the spectral frequencies of which are
measured along the abscissa. The height of the lines is an
indicator of the significance of the associated frequencies.
The plots in Figure 3.7 were truncated at Y = 6 to prevent
the obscuration of the minor frequencies.

[Figure 3.7 panels, each plotted against FREQUENCY:
TEST SET TWO WITH NOISE; TEST SET TWO WITHOUT NOISE;
LOWESS F = .2; MOVING AVERAGE M = 44; COSINE ARCH M = 44]

Figure 3.7 Comparison of Periodograms of LOWESS, MOVING
AVERAGE and COSINE ARCH Smoothing of Test Set Two.
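
For readers who wish to reproduce this kind of comparison, the sketch below computes a periodogram in Python. It uses one common textbook form, the squared magnitude of the discrete Fourier transform of the mean-corrected series divided by N; the normalization used in the thesis and in Koopmans may differ by a constant factor, so only relative heights should be compared. The function name is hypothetical.

```python
import numpy as np

def periodogram(y):
    """Return (frequencies, ordinates) of a simple periodogram.

    Frequencies are in cycles per observation; the zero frequency
    is dropped because the series mean is removed first.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    spectrum = np.fft.rfft(y - y.mean())
    power = np.abs(spectrum) ** 2 / n        # assumed normalization
    freqs = np.fft.rfftfreq(n)
    return freqs[1:], power[1:]
```

Applying the function to the raw data and to each smoothed curve, then plotting the ordinates as vertical lines, gives displays of the kind described above: an effective smoother leaves the low frequency lines of the pattern and suppresses the rest.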

A visual inspection of these periodograms reveals 
that LOWESS produces the smoothest (most noise free) curve. 
In fact, the periodogram of the LOWESS curve and noise free 
data are nearly identical. 

All of this evidence supports the conclusion that 
LOWESS performs at least as well on data sets that contain 
abrupt changes in curvature as do the widely accepted MOVING 
AVERAGE and COSINE ARCH procedures. 

3. Phase Three: Smooth Changes in Curvature

Test set three, Figure 3.8, comprised of 100 data
points having the following relationship

\[
Y = \sin X + \text{NORMAL}(0,1)\ \text{NOISE}, \qquad 0 \le X \le 2\pi
\]

was used to evaluate LOWESS' ability to follow smooth
changes in curvature. The same procedures used in the
preceding section to test LOWESS' ability to handle abrupt
pattern changes were applied here.

Test set three appears either to have a negative
linear trend or to cycle about the line Y = 0. A
series of LOWESS smooths, Figure 3.9, starting with a small
F parameter, was used to discover the general pattern
(dashed line) and refine the resulting smoothed curve (solid
line). The distorted smooth in the lower right hand plot
demonstrates the inherent danger in selecting a large F if
only one smoothing pass is planned.

[Figure 3.8 panels, each plotted against X:
TEST SET THREE WITHOUT NOISE; TEST SET THREE WITH NOISE]

Figure 3.8 Test Set Three With and Without N(0,1) Noise.

[Figure 3.9 panels: LOWESS F = .15; LOWESS F = .25; LOWESS F = .50]

Figure 3.9 Comparison of LOWESS Smoothing of Test Set
Three Using Different Values of the Parameter F.



The LOWESS curve obtained by using F=.25 provided
the most smoothing without distorting the pattern and was
used in a direct comparison with corresponding MOVING
AVERAGE and COSINE ARCH smooths, Figure 3.10. The LOWESS
smooth is the only curve that has the characteristic
sinusoidal shape. The MOVING AVERAGE plot, although very noisy,
would present the proper picture if the ends of the curve
were extended. The radical change in curvature on the left
end of the COSINE ARCH smoothed curve detracts from its
ability to represent the true shape of test set three.

[Figure 3.10 panels, each plotted against X: TEST SET THREE;
LOWESS F = .25; MOVING AVERAGE M = 25; COSINE ARCH M = 25]

Figure 3.10 Comparison of LOWESS, MOVING AVERAGE and
COSINE ARCH Smoothing of Test Set Three.

Comparison of the periodograms presented in Figure
3.11 shows, once again, that LOWESS produces the smoothest
curve, while Figure 3.10 shows that it seems to follow the
model the best.

[Figure 3.11 panels, each plotted against FREQUENCY:
TEST SET THREE WITH NOISE; TEST SET THREE WITHOUT NOISE;
MOVING AVERAGE M = 25; COSINE ARCH M = 25]

Figure 3.11 Comparison of Periodograms of LOWESS, MOVING
AVERAGE and COSINE ARCH Smoothing of Test Set Three.

The graphical comparisons made in Figures 3.10 and
3.11 demonstrate clearly that LOWESS performs at least as
well as MOVING AVERAGE and COSINE ARCH routines when
smoothing data that has a smooth curvilinear pattern.

4. Phase Four: Unequal Spacing

Besides being able to smooth all of the data points,
LOWESS enjoys another possible advantage over MOVING AVERAGE
type procedures, in that it was designed to work on unequally
as well as equally spaced data. The definition of MOVING
AVERAGES

\[
\hat{Y}_i = \sum_{j=-m}^{m} A_j Y_{i-j}
\]

holds only if the Yi's are equally spaced and have a linear
relationship over the interval (i-m) ... (i+m). Violation of
the linearity assumption introduces bias into the results
while violation of the equal spacing requirement invalidates
them. LOWESS would indeed enjoy a distinct advantage over
MOVING AVERAGE type smoothing procedures if it produces
acceptable results on irregularly spaced data.

This section examines LOWESS' ability to smooth two
different sets of this type of data. The first, natural log
of energy dissipation versus depth, Figure 3.12, is a
transformed portion of data collected during a turbulence
measuring experiment conducted by the Department of
Oceanography, U.S. Naval Postgraduate School.

The LOWESS curves obtained by using linear and
quadratic regressions during Step Three of the smoothing
procedure were compared to a quadratic least squares regression
line fit to the same data, Figure 3.13.

Higher order regressions were rejected as plausible
solutions because the regression coefficients Bj, j = 3,4,5...
were found to be statistically insignificant compared to the
Bj, j = 0,1,2 constants. A quadratic relationship also
seemed to be a reasonable assumption since turbulence is a
function of pressure which varies in proportion to depth
squared.

Figure 3.12 Natural Log of Energy Dissipation vs Depth.

QUADRATIC REGRESSION

Y = +/C×X*0 1 2   WHERE: C = -12.512  0.33412  -0.0055612

ANALYSIS OF VARIANCE TABLE

SOURCE                       SS           DF     MS
GRAND MEAN (SEE NOTE)        10275.656    1
REGRESSION                   28.970       2      14.485
RESIDUAL                     73.094       164    .446
TOTAL                        10377.719    167    62.142

THE SIGNIFICANCE LEVEL OF REGRESSION = .0000
(SIGNIFICANCE LEVEL = AREA UNDER CURVE BEYOND COMPUTED F)
R SQUARE (SEE NOTE) = .284
NOTE: IN WEIGHTED CASE, SEE DESCRIPTION FOR MEANING

Figure 3.13 Quadratic Regression and Analysis of Variance
Table for Ln Energy Dissipation Versus Depth.
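
The entries of the analysis of variance table in Figure 3.13 are related by simple arithmetic, which the following Python sketch reproduces from the tabulated sums of squares and degrees of freedom. The function name is hypothetical, and the F ratio it returns is not printed in the figure; it is computed here only as the usual ratio of the two mean squares.

```python
def anova_summary(ss_regression, df_regression, ss_residual, df_residual):
    """Mean squares, F ratio and R square from sums of squares."""
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return {
        "ms_regression": ms_regression,
        "ms_residual": ms_residual,
        "f_ratio": ms_regression / ms_residual,
        # R square about the mean: explained SS over (explained + residual)
        "r_square": ss_regression / (ss_regression + ss_residual),
    }

# Values read from Figure 3.13.
summary = anova_summary(28.970, 2, 73.094, 164)
```

The computed mean squares (14.485 and about .446) and R square (about .284) agree with the figure. The F ratio of roughly 32.5 is consistent with the very small reported significance level, while the low R square shows that the quadratic explains only a modest share of the scatter.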

Figure 3.14 shows that the LOWESS curves (solid
lines) for the linear (P = 1) smooths follow the general
quadratic regression (dashed lines) for small values of F
but flatten the pattern for large F's. The quadratic (P = 2)
LOWESS curves close in on the regression line as F increases
and produce a fairly good match as F reaches .75.

The quadratic LOWESS curve also appears to follow
local peaks and valleys more accurately for small F's than
does its linear counterpart. This is not unexpected. Figure
3.15 shows that the characteristically bowed shape of a
quadratic curve produces larger Ŷi values in the middle of a
data set (Xi is located in the middle of the LOWESS
neighborhood) than a straight line fitted to the same data.

The "fits" of Figure 3.14 can be compared
analytically, as was done in the Phase One test, by examining the
distribution of their residuals. Combining these analytical
results with graphical comparisons provides some goodness of
fit measure for the two curves. The nonparametric Smirnov
two sample test [Ref. 12] is appropriate in this case
because the distribution of the residuals is unknown. The
results of this test conducted at the 95% confidence level,
Table III, indicate that there is no significant statistical
difference between the F=.75 quadratic LOWESS curve and the
quadratic least squares regression line. See the lower right
hand plot of Figure 3.14.
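
The two sample comparison can be sketched in Python as follows. This is an illustration of the general technique rather than the routine used in the thesis: the statistic is the largest vertical distance between the two empirical distribution functions, and the critical value uses the familiar large sample approximation with coefficient 1.36 for the 95% level. The function names are hypothetical.

```python
import numpy as np

def smirnov_statistic(a, b):
    """Largest gap between the empirical CDFs of two samples."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])            # gaps can only change here
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

def smirnov_critical_value(n, m, coeff=1.36):
    """Approximate large sample critical value (coeff 1.36 for 95%)."""
    return coeff * np.sqrt((n + m) / (n * m))
```

Assuming about 167 residuals from each fit, a sample size inferred from the degrees of freedom in Figure 3.13 rather than stated in the text, `smirnov_critical_value(167, 167)` is approximately .149, the critical value listed in Table III. The hypothesis of identical residual distributions is rejected when the statistic exceeds that value.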

This example demonstrates that LOWESS works quite
well on unequally spaced data. It also shows that quadratic
LOWESS works better than the linear model when neighborhood
sizes are too large to support the assumption that the
neighborhood points are related linearly. Quadratic LOWESS
should be used whenever the data suggest that this
assumption is not true.

[Figure 3.14 panels: ROBUST LOWESS SMOOTHING: ENERGY DISSIPATION DATA;
LINEAR, F = .2; QUADRATIC, F = .2]

Figure 3.14 LOWESS Smoothing of Energy Dissipation Data
Using Linear and Quadratic Regressions in Step Three.

The second irregularly shaped plot to be smoothed, a
lag-1 plot of 200 NEAR(1) random variables, is pictured in
Figure 3.16.

Figure 3.15 LOWESS Smoothing of X53 in Energy Dissipation
Data Using Linear and Quadratic Regressions in Step Three.







TABLE III

Smirnov Test Comparing the Distribution of Residuals from
Smoothing and Regression of Energy Data

type     F       T        Ks(.95)
lin      .50     .216     .149       reject
lin      .75     .156     .149       reject
quad     .50     .156     .149       reject
quad     .75     .078     .149       accept



The NEAR(1) process, derived by Lawrence and Lewis
[Ref. 13], is a new first order autoregressive time series
model with exponentially distributed marginals. NEAR(1) data
is generated as a simple linear combination of a series, En,
of independent exponential random variables by the model

\[
X_N = E_N +
\begin{cases}
B X_{N-1} & \text{w.p. } A \\
0 & \text{w.p. } (1 - A)
\end{cases}
\qquad N = 0, 1, 2, \ldots
\]

Figure 3.16 Lag-1 Plot of NEAR(1) Random Variables
Having Autocorrelation .75.

These NEAR(1) variables have some interesting
properties that make them especially suitable for testing
smoothing routines. They have fixed serial lag-1
correlation, $\rho = AB$, and have conditional expectation

\[
E[X_N \mid X_{N-1} = x] = (1 - AB)\lambda + ABx
\]

The following parameters were used to generate the variables
for the test: A = .83, B = .9, $\lambda$ = 1. A successful smooth of
Figure 3.16 should produce a straight line of the form

Y = .25 + .75X

not at all what one would expect from looking at the plot.

Figure 3.17 presents comparison plots of robust and
non-robust linear regression and robust and non-robust
LOWESS smoothing of the NEAR(1) data of Figure 3.16. The
robust regression function contained in the IBM GRAFSTAT
package was used in this example.
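
The lag-1 relationship of a NEAR(1) series can be checked by simulation. The Python sketch below is an illustration only and is not the generator used for Figure 3.16. It assumes that the innovation term is the two-component exponential mixture that keeps the marginal distribution exponential, which follows from the transform of the model but is not spelled out in the text, and it treats the third parameter as the marginal mean. The function name, the seed and the default sample size are hypothetical.

```python
import numpy as np

def simulate_near1(n, a=0.83, b=0.9, lam=1.0, seed=0):
    """Simulate n values of a NEAR(1) series with marginal mean lam."""
    rng = np.random.default_rng(seed)
    c = (1.0 - a) * b
    p_plain = (1.0 - b) / (1.0 - c)      # prob. innovation is a full exponential
    e = rng.exponential(lam, size=n)
    # Otherwise the exponential is shrunk by the factor c (assumed mixture).
    innovation = np.where(rng.random(n) < p_plain, 1.0, c) * e
    carry = rng.random(n) < a            # carry b*X[n-1] with probability a
    x = np.empty(n)
    prev = rng.exponential(lam)          # start from the marginal distribution
    for i in range(n):
        prev = innovation[i] + (b * prev if carry[i] else 0.0)
        x[i] = prev
    return x

x = simulate_near1(200_000)
slope, intercept = np.polyfit(x[:-1], x[1:], 1)   # least squares lag-1 line
```

For a long series the fitted slope and intercept should lie close to AB = .747 and (1 - AB) = .253, the line Y = .25 + .75X quoted above. With only 200 points, as in Figure 3.16, the scatter about that line is wide and strongly asymmetric, which is why the smooths have difficulty recovering it.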

Examination of the plots in Figure 3.17 shows, once
again, that LOWESS smooths are comparable to those produced
by accepted linear regression techniques. It also reveals
that neither the linear regression nor LOWESS procedures
were able to reproduce the true lag-1 relationship, (Y = .25
+ .75X), shown in the lower right hand plot. Both robust
curves do present an accurate picture of where most of the
data points lie, and could be used to predict where a
majority of the future points are likely to fall. Relying on
these curves, however, would probably lead to the conclusion
that the points above and below these lines represent
outliers, which may or may not be the case.

It must be concluded from LOWESS' performance on
these two data sets, however, that it smooths unequally
spaced data as well as currently available regression
techniques.

Figure 3.17 Comparison of Robust and Non-Robust Linear
Regression and LOWESS Smoothing of the Lag-1 Plot
of NEAR(1) Data.

IV. USING THE APL VERSION OF LOWESS



A. OVERVIEW 

This chapter provides prospective users with detailed
instructions for using LOWESS as a stand-alone program or in
combination with the experimental GRAFSTAT graphics package.
In either mode, LOWESS will provide the user with vectors of
robust or non-robust smoothed Ŷi values and their associated
residuals. When used in conjunction with GRAFSTAT, it will
also produce a scatter plot of the original data with the
LOWESS smoothed curve superimposed. A similar presentation
of the absolute value of the residuals versus Xi is
also available on request from the program, Figure 4.1.



[Figure 4.1 panel title: NON-ROBUST LOWESS SMOOTHING; F = .7]

Figure 4.1 Sample of Graphical Outputs from LOWESS:
Smooths of the Data (left), and Residuals (right).



LOWESS is a completely interactive program. All user
defined parameters and option selections are entered in
response to program queries. The stand-alone and combined
graphics modes of operation are differentiated only by their
initial set up procedures and by the choice of terminals on
which the program is run.

Although no APL programming skills are required to
operate LOWESS, users should become familiar with system
commands and procedures for entering the APL environment,
loading and copying workspaces and variables and for saving
workspaces by reading appropriate sections of [Ref. 14].
Operating instructions presented in the follow-on sections
of this chapter have been written for users who have had
little or no experience with APL. Experienced users may find
it more convenient to refer to the summarized procedures
presented in the Tables at the end of this chapter.

LOWESS is not a W.R. Church computer center supported
program and is not included in any of the APL libraries
listed in [Ref. 15]. Interested users should contact
Professor P.A.W. Lewis, Department of Operations Research,
U.S. Naval Postgraduate School, for information concerning
access to the APL workspace DTNLFNS. This workspace, which
contains LOWESS and several other data analysis related
programs, should be copied and stored on the user's A disk.

B. TERMINAL REQUIREMENTS

LOWESS, in the stand-alone mode, can be run on any APL
capable terminal at the U.S. Naval Postgraduate School. The
IBM GRAFSTAT software, which generates the graphical
displays when operating LOWESS in the combined graphics
mode, requires the use of either IBM 3277GA or 3278/79
graphics display terminals. The 3278 terminals require
special modification to produce graphical displays. None of
these terminals are available for public use at the Naval
Postgraduate School. See Table IV for a summary.

C. PROGRAM INITIALIZATION: STAND-ALONE MODE

Since LOWESS is written in APL, users must enter the APL
sub-environment after completing normal log on procedures.
This is done by typing the letters "APL" and depressing the
enter key. The response "CLEAR WS" indicates that the
computer is ready to accept APL commands.

APL uses a special character set that is invoked by
keying the APL ON/OFF key while depressing the ALT key on
IBM 3278/79 terminals or by merely hitting the APL ON/OFF
key on the 3277GA graphics display terminals. These special
APL characters are imprinted in red (3278/79 terminals) or
black (3277GA terminals) on the top and front surfaces of
the normal keys. The symbols located on the front of the
keys are accessed by typing the appropriate key while
depressing the APL ALT key. When two APL characters are
pictured on the top surface of the same key, the uppermost
character is invoked by hitting that key while depressing
the SHIFT key, much the same as producing capital letters
during normal typing operations.

The final step in the initialization procedure consists
of loading LOWESS and associated sub-programs into the
active APL workspace. This is accomplished by entering the
system command ")PCOPY DTNLFNS LOWESS" 1. This command
copies a group of programs required to execute LOWESS. See
[Ref. 16, p.107] for information about the APL GROUP
command. The computer responds by presenting WS size and
"date-saved" information when all programs have been loaded.
Initialization is now complete and the user is ready to
execute LOWESS by typing "LOWESS" and hitting enter. From
this point on, user entries are made in response to program
queries or instructions. Table V summarizes these
initialization procedures.



1 Underscored letters are obtained by typing the desired
letter while depressing the APL ALT key.

D. PROGRAM INITIALIZATION: COMBINED GRAPHICS MODE

As noted in Section B of this chapter, the combined
LOWESS-GRAFSTAT package can only be run on IBM 3277GA, 3279
or specially configured 3278 graphics display terminals.
Additionally, efficient operation of GRAFSTAT requires a
minimum workspace size of 2 megabytes. The W.R. Church
Computer Center has established a limited number of public
domain workspaces with special account numbers and passwords
to meet this need [Ref. 5]. Hard copy graphics printers
are available for use with the 3277GA terminals located in
Ingersall, Root and Spanegall Halls. The remainder of this
section focuses on the use of the 3277GA terminals.

Data files stored on the user's personal disk are
unavailable for use while operating in one of the public
workspaces. Users may:

1. send files to the public workspace's user number
prior to logging on and commencing a work session;

2. link to their own disk after logging on to the
public workspace using CP link procedures outlined
in [Ref. 17].

After logging on to one of the public workspaces and
completing the data transfer or linking procedures described
above, the user must enter the APL sub-environment by typing
"APLGS7" 1 and hitting the enter key. The response, "CLEAR
WS", indicates that the computer is ready to accept APL
commands.

The special APL characters, labelled in black, are
invoked by depressing the APL ON/OFF key. Since this key
also turns the APL characters off, it may be necessary to
check their status by trial and error. Detailed instructions
for using the APL character set are presented in Section C
of this chapter.

1 The command, "APLGS7", invokes special system routines
required to support the IBM GRAFSTAT software package. This
procedure may change. Contact Professor P.A.W. Lewis,
Department of Operations Research, if these procedures do
not work.

The initialization procedure is completed by loading
GRAFSTAT and LOWESS into the active APL workspace. GRAFSTAT
should be loaded first, by entering the system command
")LOAD GRAFSTAT". The GRAFSTAT package is quite large and
may take several minutes to load. The following set of user
instructions will appear on the screen when GRAFSTAT is
fully loaded:



THIS IS A NEW (5/1/84) RELEASE OF GRAFSTAT. IT RUNS ON THE
3277/GA OR ON THE 3278/79. IT HAS A NUMBER OF NEW FUNCTIONS.
YOUR OLD CONTROL VECTORS WILL WORK AS BEFORE. IF YOU )COPY
RATHER THAN )LOAD THIS WORKSPACE YOU MUST EXECUTE THE
FUNCTION LATENT BEFORE STARTING. THE NEXT RELEASE IS
SCHEDULED FOR 7/84.

TO BEGIN, TYPE: START 

FOR MORE INFORMATION, TYPE: DESCRIBE 

It is not necessary for the user to start, or even 
interact with GRAFSTAT to smooth a set of data: the GRAFSTAT 
message may be cleared by depressing the CLEAR key. 

Users who have the APL workspace DTNLFNS stored on the
public workspace disk, or who are linked to their own
personal disk where it is stored, need only enter ")PCOPY
DTNLFNS LOWESS" to complete the initialization process. The
computer responds by presenting WS size and date saved
information when all programs have been loaded.
Initialization is now complete and the user is ready to
execute LOWESS by typing "LOWESS" and hitting enter. From
this point on, user entries are made in response to program
queries or instructions. See Table VI for a summary of these
procedures.

E. OPERATION OF LOWESS

This section provides detailed descriptions of the user
inputs required during normal operation of LOWESS. The
discussion assumes that one of the initialization procedures
described in Sections C and D of this chapter has already
been completed.

Execution of the LOWESS program is initiated by typing
"LOWESS" and hitting the return key. Since the program is
interactive, it will respond with a series of queries or
instructions requesting the user to input data or make
decisions about the operation of the program. The exact sequence
of program initiated queries and instructions is formulated
in response to user inputs.

User-computer interactions required during execution of
LOWESS are categorized into two types: data input and
program operation.

Since the program cannot operate without data, the
initial concern of LOWESS is to locate and read the data set
it is about to smooth. Data can be read from the active APL
workspace, a stored APL workspace or a stored CMS file.
Data that is not located in the active workspace must be
accessible from that workspace. This presents no problem
when the user is operating under a personal user
number and the data is stored on the user's own disk. It may
become a problem when the user is logged on to one of the
public workspaces described in Section D of this chapter,
and has not:

1. sent the data to the public workspace in use
and stored it on the associated A disk;

2. linked to his/her own disk prior to entering the APL
sub-environment, see Section D of this chapter.

Wherever the data is stored, it MUST be formatted into
two separate lists, one containing the X values and the
other containing the corresponding Y values of the points
being smoothed.

Data which resides in the active workspace as APL
vectors 1 is entered into LOWESS when the user types the
variable name and hits enter in response to appropriate
program requests.

Data which is stored in another APL workspace on the
disk in use, or on a disk to which the user is linked, will
be transferred to the active workspace by the sub-program
DATAINPUT. The user needs only to enter the workspace name
and variable names when requested. DATAINPUT will also read
and convert CMS files stored on the disk in use or on a disk
to which the user is linked, provided they are formatted as
described above and contain only numerical data. A mixture
of alphabetic and numeric characters in a CMS data file will
create an error and terminate execution of LOWESS. These
data transfer features will work equally well in either mode
of operation. The IBM GRAFSTAT program contains functions
entitled CMS READ and CMS WRITE that will convert data in
both directions when operating in the combined graphics
mode. Users will generally not need to use this feature of
GRAFSTAT, however.

Program operation inputs include:

1. the value of the parameter F (selection considerations
are discussed in Chapter II Section C);

2. whether robust or non-robust smoothing is desired;

3. whether or not a plot of the original data and
smoothed curve is desired;

4. whether or not a plot of the absolute values of the
residuals and associated smoothed curve is desired;

5. X and Y axis labels for these plots.

1 In APL, a list of data points stored under a single
variable name is referred to as a vector. See [Ref. 14] for
further details.

Plots can only be generated while operating LOWESS in
the combined graphics mode. Requesting plots when GRAFSTAT
has not been loaded will produce an error and terminate
execution. Hard copies of plots may be obtained by
depressing the HARD COPY button on the bottom of the
graphics screen.





TABLE IV

Summary of Terminal Requirements and Available Outputs

                 Stand-Alone Mode         Combined Graphics

Terminal         3277GA, 3278, 3279       3277GA, 3279 or 3278
Required                                  with graphics board

Additional       none                     IBM GRAFSTAT pgm.
Software
Required

Available        Numerical:               Numerical:
Output           YSMTH .. smooth Y        YSMTH .. smooth Y
                 X1 ... original X        X1 ... original X
                 Y1 ... original Y        Y1 ... original Y
                 RESY .. residuals        RESY .. residuals

                                          Graphical:
                                          Smooth curve
                                          |Residuals| vs Xi

TABLE V

Initialization Procedures, Stand-Alone Mode

Objective                 User Inputs               Program Response

(1) enter APL             "APL"                     "CLEAR WS"
    environment

(2) invoke APL            APL ON/OFF key            none
    characters

(3) load LOWESS           )PCOPY DTNLFNS LOWESS     "saved (date) (time)"
    and assoc.
    programs







                              TABLE VI

           Initialization Procedures, Combined Graphics

Objective            User Inputs               Program Response

(1) enter APL        "APLGS7"                  "CLEAR WS"
    environment

(2) invoke APL       APL ON/OFF key            none
    characters

(3) load             ")LOAD GRAFSTAT"          initialization
    GRAFSTAT                                   screen, see p 59

(4) load             ")PCOPY DTNLFNS           "saved (time) (date)"
    LOWESS           LOWESS"

(5) execute          "LOWESS"
V. USING THE FORTRAN VERSION OF LOWESS



A. OVERVIEW



This chapter provides prospective users with detailed
instructions for using a FORTRAN program that performs
the LOWESS curve smoothing procedure described in Chapter
II. The program, entitled LOWESS, provides the user with
CMS files containing robust or non-robust smoothed Yi values and
their associated residuals. These data files can be used to
create plots of the raw and smoothed data points using
DISPLA [Ref. 7], EASYPLOT, or other W.R. Church computer
center supported IMSL or NON-IMSL plotting routines.

LOWESS is a completely interactive program. All user
defined parameters and option selections are entered in
response to program queries.

Although no FORTRAN programming skills are required to
operate LOWESS, users should become familiar with FORTRAN
and WATFIV operating system commands, and with the basic
XEDIT editor, by reading the appropriate sections of [Ref. 18]
and [Ref. 19]. A limited ability to format, XEDIT and
manipulate data files will be helpful when using LOWESS or
when interacting with any of the plotting routines mentioned
earlier.



B. TERMINAL REQUIREMENTS 



LOWESS can be run on any remote terminal attached to the
IBM computer located at the Naval Postgraduate School. The
DISPLA and EASYPLOT plotting routines require the use of the
IBM 3277GA graphics display terminals located in Ingersoll,
Root and Spanagel Halls. Plotting routines that use the
remote VERSATEC or line printers can be accessed from any
terminal.
C. PROGRAM INITIALIZATION (FORTRAN VERSION)



Since LOWESS is not a W.R. Church computer center
supported program, it is not available in any of the
center's public access libraries. Interested users should
contact Professor P.A.W. Lewis, Department of Operations
Research, U.S. Naval Postgraduate School, for information
concerning access to LOWESS and its supporting programs.
Copies of the programs listed in Table VII should be
obtained and stored on the user's A disk. Annotated copies
of the source codes are contained in Appendix B.





                             TABLE VII

  Programs and Subroutines Required for the Operation and Support
                 of the FORTRAN Version of LOWESS

      Filename        Filetype        Filemode

      LOWESS          FORTRAN         A1
      LOWS            EXEC            A1
      PXSORT          FORTRAN         A1
      LLBQF           FORTRAN         A1



PXSORT and LLBQF are contained in the IMSL library.
Users having access to these programs through the W.R.
Church computer center need not obtain personal copies.

The LOWS EXEC is used to activate system libraries and to
designate the CMS storage space required for LOWESS input and
output files. It is invoked by typing "LOWS EXEC" and
hitting the ENTER key. The file definitions contained in the
LOWS EXEC are listed in Table VIII. See [Ref. 17] for
information on the use of EXEC executive programs.

This EXEC defines enough file space to accommodate five
data sets. To smooth any of the data sets, the user need only
enter the appropriate file number when queried by LOWESS.
                            TABLE VIII

        Input and Output File Definitions Used in LOWS

      File number     Filename        Filetype

      2               LOW2            DATA
      3               LOW3            DATA
      4               LOW4            DATA
      7               LOW7            DATA
      8               LOW8            DATA



It may become necessary to change these filenames to
avoid losing data when smoothing a large number of data sets
or when smoothing one set a number of times. This may be
accomplished in one of the following ways:

1. by entering the CMS command "XEDIT LOWS EXEC" and
changing the appropriate names;

2. by using the CMS command "R (old filename) (old filetype)
(old filemode) (new filename) (new filetype)
(new filemode)" for each file needing to be changed,
see [Ref. 18].

File management is important. It is imperative
that data input files have the same filename, filetype
and filemode listed in the LOWS EXEC, to prevent inadvertent
smoothing of the wrong data and to prevent program errors.

D. DATA FILES (FORTRAN VERSION) 

LOWESS requires that data be input in two columns of
floating point constants in (2F15.5) format, X values on the
left and Y values on the right. This is accomplished by
creating a new file with the command "XEDIT (filename)
(filetype)". The filename and filetype chosen should be one
of those listed in Table VIII or one that is contained in
the user's own LOWS EXEC. Refer to [Ref. 19], chapter 2, for
more detailed instruction on creating files. The (2F15.5)
format requires that all input values contain a decimal
point followed by no more than five decimal places. The X
values must be entered in the first fifteen spaces and the Y
values in the second fifteen spaces of each line (one (X,Y)
pair per line).
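
The (2F15.5) record layout can also be produced outside of XEDIT. The following Python sketch is illustrative only and is not part of the thesis software; the function names are hypothetical, and it assumes the resulting records are later stored on the user's A disk under one of the names defined in the LOWS EXEC.

```python
def format_2f15_5(x, y):
    """Return one input record: X in columns 1-15, Y in columns 16-30."""
    return f"{x:15.5f}{y:15.5f}"


def format_records(xs, ys):
    """Return one (2F15.5) record per (X, Y) pair, in the order given."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must contain the same number of values")
    return [format_2f15_5(x, y) for x, y in zip(xs, ys)]


def write_lowess_input(path, xs, ys):
    """Write the records to a text file, one (X, Y) pair per line."""
    with open(path, "w") as handle:
        for record in format_records(xs, ys):
            handle.write(record + "\n")
```

For values that fit within fifteen characters, each record is exactly thirty characters long, so a FORTRAN READ using the (2F15.5) format finds X in columns 1-15 and Y in columns 16-30.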

The output from LOWESS is placed in a file designated by
the user. This can be the same file used for inputting the
(X,Y) values or a different one. A different file should be
used if the same data set is going to be smoothed with
several different parameters. This output is printed in
(4F15.3) format. The first column contains the original X values
ordered from smallest to largest. Column two contains the
corresponding Y values, column three contains the
smoothed Ŷi values, and column four contains the (Yi - Ŷi)
residuals.
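
A file in the four-column layout described above can be read back by slicing each record into fifteen-character fields. The Python sketch below is a hedged illustration, not part of the thesis software; the function name is hypothetical, and it assumes every record holds exactly the four fields X, Y, smoothed Y and residual, in that order.

```python
def parse_output_record(line, width=15, fields=4):
    """Split one fixed-width record into (x, y, y_smooth, residual)."""
    values = []
    for k in range(fields):
        # float() ignores the blanks that pad each fifteen-character field.
        values.append(float(line[k * width:(k + 1) * width]))
    return tuple(values)
```

A quick consistency check on any parsed record is that the fourth field should equal the second minus the third, to within the three decimal places printed.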

E. OPERATION OF LOWESS (FORTRAN VERSION) 

This section provides detailed descriptions of the user
inputs required during normal operation of LOWESS. The
discussion assumes that the LOWS EXEC has been properly
prepared and executed and that input files have been built
according to the instructions presented in Section D of this
chapter.

Execution of the LOWESS program is initiated by typing
"WATFIV LOWESS * (XTYPE". Since the program is interactive,
it will respond with a series of queries or instructions
requesting the user to input data or make decisions about
the operation of the program.

The initial concern of LOWESS is to locate and read the
data set it is about to smooth. Data can only be read from
one of the files defined in the LOWS EXEC routine. The user
tells LOWESS which file to read by entering the appropriate
file number (2, 3, 4, 7 or 8) in response to the instruction
"ENTER THE FILE NUMBER OF THE INPUT DATA FILE." The program
will terminate with an error if the LOWS EXEC was not
properly prepared or if the data file was not formatted as
described in the preceding section. Other program requested
inputs include:

1. the value of the parameter F (selection considerations
are discussed in Chapter II, Section C);

2. whether robust or non-robust smoothing is desired;

3. the file number of the desired output file.
APPENDIX A
APL PROGRAMS



This Appendix contains annotated listings of the APL
programs written for this thesis. Source listings of the
system library programs used to support the CMSREAD function
called in the program DATAINPUT are not included.

LOWESS is an interactive program that executes the
Robust-Locally-Weighted Regression Scatter-Plot Smoothing
procedure described in the preceding sections of this
paper. It calls the following subprograms: DATAINPUT,
REPEATCK, REGRES, REGRES2, PLOTQUERY and LOWS during execution.
Refer to Chapter IV for detailed user instructions.



>»LOWESS 

    LOWESS;N;Q;WX;J;I;A;B;Q;STRP;U;D;TX;WT;Z;BR;DA;DB;R;U1;M;R0;AR;RHS;PROCEED;N1;PT;SKP;YS;P;ROB;REG;XAXIS;YAXIS;PHDR;QS5;QS6;PT
    ⍝⍝⍝ DO NOT MOVE OR ERASE, GRAFSTAT FUNCTION HEADER
    ⍝⍝⍝ GRAFSTAT WILL NOT ADD A LINE TO THIS FUNCTION WITHOUT
    ⍝⍝⍝ THIS HEADER
    ⍝⍝⍝
    ⍝⍝⍝ LOWESS CALLS THE FOLLOWING PROGRAMS AND VARIABLES:
    ⍝⍝⍝ DATAINPUT, REPEATCK, PLOTQUERY, REGRES, REGRES2, RPLT,
    ⍝⍝⍝ NRPLT, RESPLT, SRESPLT
    ⍝⍝⍝
    ⎕PP←6
    DATAINPUT
    →L9×⍳(PROCEED≠'N')
    →0
    L9:Y1←Y←Y[⍋X]                                       ⍝ ORDER DATA
    X1←X←X[⍋X]
    'INPUT F ... (0≤F≤1)'
    Q←⌊0.5+Q←(N1←⍴X)×F←⎕
    'DO YOU WANT TO USE LINEAR OR QUADRATIC FITTING DURING'
    'THIS SMOOTHING ROUTINE?'
    '(LIN OR QUAD)'
    REG←1↑⍞
    'DO YOU WANT TO USE THE ROBUST SMOOTHING OPTION?'
    '(YES OR NO)'
    ROB←1↑⍞
    YS←N1⍴0
    WX←N1⍴1
    J←0                                                 ⍝ COUNTER FOR ROBUST SMOOTHING LOOP
    L1:J←J+1
    I←0
    A←1                                                 ⍝ STARTS FIRST STRIP AT X1 ... XQ
    B←Q
    L2:I←I+1                                            ⍝ INCREMENTS THROUGH X1 ... XN
    →L6×⍳(I>N1)
    REPEATCK                                            ⍝ PREVENTS COMPUTATIONS OF YI FOR REPEAT XI
    →L5×⍳(SKP='Y')
    STRP←(A+(0,⍳(B-A)))
    →L3×⍳0≠D←⌈/|U←(X[I]∘.-X[STRP])                      ⍝ COMPUTES DI
    YS[I]←(+/(LST/Y))÷(+/LST←X=X[I])                    ⍝ USES AVG YI IF DI = 0
    →L5
    L3:WT←WX[STRP]×TX←((1-(|U*3))*3)×((|U←U÷D)<1)       ⍝ TRICUBE WT FCN
    L4:→R2×⍳(REG≠'L')
    X[STRP] REGRES Y[STRP]                              ⍝ WEIGHTED REGRESSIONS
    →L5
    R2:X[STRP] REGRES2 Y[STRP]
    L5:→L2×⍳(B≥N1)∨(I≥N1)
    →L2×⍳((DA←(X[I+1]-X[A]))≤(DB←(X[B+1]-X[I+1])))
    A←A+1                                               ⍝ ADVANCE STRIP
    B←B+1
    →L5
    L6:R0←|R[⍋(|R←RESY←(Y-YS))]
    →L10×⍳(0≠M←0.5×+/|(R0[(⌈N1÷2),1+⌊N1÷2]))
    U1←1
    →L11
    L10:U1←R÷(6×M)
    L11:WX←((1-(U1*2))*2)×((|U1)<1)                     ⍝ BICUBE WT FCN
    →L7×⍳(ROB≠'Y')
    →L1×⍳(J≤2)
    L7:PLOTQUERY                                        ⍝ RUN PLOTS
    YSMTH←YS
    →L8×⍳(PT≠'Y')
    →0
    L8:'THE OUTPUT FROM THIS LOWESS SMOOTHING IS STORED UNDER THE'
    'FOLLOWING VARIABLE NAMES:'
    'YSMTH   SMOOTHED Y VALUES'
    'X1      X VALUES ARRANGED IN ASCENDING ORDER'
    'Y1      ORIGINAL Y VALUES'
    'RESY    RESIDUALS'
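
The two weight functions that appear in the LOWESS listing, the tricube neighbourhood weights and the robustness weights computed from the residuals, can be sketched compactly in Python. This is an illustrative sketch rather than a translation of the APL; the function names are hypothetical, and the handling of a zero distance or a zero median is an assumption (unit weights, as the FORTRAN listing in Appendix B does for a zero median).

```python
def tricube_weights(x_strip, x_i):
    """Tricube weights for the points of one strip, centred on x_i."""
    distances = [abs(x - x_i) for x in x_strip]
    d = max(distances)          # distance to the farthest point in the strip
    if d == 0:
        # Assumption: every point in the strip shares the same X value,
        # so all of them receive unit weight.
        return [1.0] * len(x_strip)
    weights = []
    for dist in distances:
        u = dist / d
        weights.append((1.0 - u ** 3) ** 3 if u < 1.0 else 0.0)
    return weights


def robustness_weights(residuals):
    """Weights (1 - u^2)^2 computed from the residuals of the previous pass."""
    n = len(residuals)
    ordered = sorted(abs(r) for r in residuals)
    # Median of |residuals|: mean of the ceil(n/2)-th and (1 + floor(n/2))-th
    # order statistics (1-based), written here with 0-based indices.
    m = 0.5 * (ordered[(n + 1) // 2 - 1] + ordered[n // 2])
    if m == 0:
        # Assumption: fall back to unit weights when the median is zero.
        return [1.0] * n
    weights = []
    for r in residuals:
        u = r / (6.0 * m)
        weights.append((1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0)
    return weights
```

In the procedure, the weight used for each point in a weighted regression is the product of its tricube weight and its current robustness weight; on the first pass all robustness weights equal one.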



DATAINPUT controls the data entry portion of the procedure.
Data and program operating parameters are entered in
response to program queries. DATAINPUT accepts data that are
stored in the active APL workspace, transfers data from
other APL workspaces and converts CMS data into APL.



    DATAINPUT;QS1;QS2;QS4
    PROCEED←'Y'
    ' '
    'IS YOUR DATA SET LOCATED IN THIS WORKSPACE?'
    '(YES OR NO)'
    QS1←1↑⍞
    →LP1×⍳(QS1='N')
    'ENTER THE NAME OF THE X VARIABLE'
    X←⎕
    'ENTER THE NAME OF THE Y VARIABLE'
    Y←⎕
    →END
    LP1:'IS YOUR DATA LOCATED:'
    ' (1) IN AN APL WORKSPACE LOCATED ON THIS DISK OR ON A DISK'
    '     THAT YOU ARE LINKED TO;'
    ' (2) IN A CMS FILE ON THIS DISK OR ON A DISK THAT YOU ARE'
    '     LINKED TO;'
    ' (3) NEITHER (1) OR (2) ABOVE.'
    'ENTER (1,2 OR 3)'
    QS2←⎕
    →(LP2,LP3,LP4)[QS2]
    LP2:'TO TRANSFER YOUR DATA TO THIS WORKSPACE:'
    ' (1) TYPE ... )PCOPY (WS NAME) (X VARIABLE NAME) (Y VARIABLE NAME)'
    '     EXAMPLE: )PCOPY DATA X Y'
    '     IF YOUR DATA IS STORED AS TWO SEPARATE VARIABLES'
    ' (2) TYPE ... )PCOPY (WS NAME) (VARIABLE NAME)'
    '     EXAMPLE: )PCOPY DATA ARRAY'
    '     IF YOUR DATA IS STORED UNDER A SINGLE VARIABLE NAME'
    '     AS IN A TWO DIMENSIONAL ARRAY'
    ' '
    ' DATE AND TIME SAVED INFORMATION IS DISPLAYED'
    ' WHEN THE TRANSFER IS COMPLETE. THEN ENTER GO'
    ' TO CONTINUE THE LOWESS SMOOTHING PROGRAM'
    S∆DATAINPUT←GO
    GO:'DO YOU NEED TO DEFINE YOUR X AND Y VARIABLES ANY FURTHER?'
    'ANSWER NO IF YOU ENTERED SEPARATE X AND Y VARIABLE NAMES'
    'IN THE PRECEDING STEP. OTHERWISE ANSWER YES.'
    '(YES OR NO)'
    QS3←1↑⍞
    →END×⍳(QS3='N')
    'DEFINE THE X VARIABLE'
    X←⎕
    'DEFINE THE Y VARIABLE'
    Y←⎕
    →END
    LP3:'TO TRANSFER YOUR CMS DATA FILE TO THIS WORKSPACE:'
    ' (1) ANSWER THE FOLLOWING QUESTIONS ABOUT YOUR X DATA FILE'
    X←CMSREAD
    ' (2) ANSWER THE FOLLOWING QUESTIONS ABOUT YOUR Y DATA FILE'
    Y←CMSREAD
    'YOU ARE NOW READY TO PROCEED WITH LOWESS'
    →END
    LP4:'YOUR DATA MUST BE STORED IN AN APL WORKSPACE OR IN A CMS FILE'
    'LOCATED ON THIS DISK OR ON A DISK TO WHICH YOU ARE LINKED. LOWESS'
    'IS BEING TERMINATED. PLEASE COMPLY WITH CONDITION (1) OR (2)'
    'AND REINITIATE LOWESS.'
    PROCEED←'N'
    END:S∆DATAINPUT←0
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REPEATCK reduces the number of computations required to 
smooth a data set by assigning the same smoothed Y value to 
data points that have the same X value. 



    REPEATCK
    SKP←'N'
    →END×⍳(I≤1)
    →END×⍳(X[I]≠X[I-1])
    YS[I]←YS[I-1]
    SKP←'Y'
    END:
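
The shortcut taken by REPEATCK can be illustrated with a short Python sketch. It is not part of the thesis software: the function name and the `fit_at` callback (standing in for the weighted regression at one point) are hypothetical, and the X values are assumed to be sorted in ascending order, as they are in LOWESS.

```python
def smooth_with_repeat_check(xs, fit_at):
    """Smooth every point, reusing the previous value for repeated X values.

    xs      -- X values sorted in ascending order
    fit_at  -- function taking an index i and returning the smoothed Y at xs[i]
    """
    smoothed = []
    for i, x in enumerate(xs):
        if i > 0 and x == xs[i - 1]:
            # Same X as the previous point: copy its smoothed value
            # instead of repeating the regression.
            smoothed.append(smoothed[-1])
        else:
            smoothed.append(fit_at(i))
    return smoothed
```

With six points of which three are repeats, for example, only three regressions are carried out.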



PLOTQUERY controls the graphical output when operating
with the IBM GRAFSTAT statistical graphics package.
It calls the subprogram LOWS to smooth the absolute value
of the (Yi - Ŷi) residuals obtained from smoothing the
original data.



    PLOTQUERY
    ' '
    'DO YOU WANT A PLOT OF YOUR LOWESS SMOOTHED CURVE'
    '(YES OR NO) ENTER NO IF NOT USING GRAFSTAT'
    PT←1↑⍞
    →END×⍳(PT≠'Y')
    'INPUT X AXIS LABEL'
    XAXIS←⍞
    'INPUT Y AXIS LABEL'
    YAXIS←⍞
    →PL1×⍳(ROB≠'Y')
    PHDR←'ROBUST LOWESS SMOOTHING; F = ',⍕F
    RUN RPLT
    →PL2
    PL1:PHDR←'NON-ROBUST LOWESS SMOOTHING; F = ',⍕F
    RUN NRPLT
    PL2:'DO YOU WANT A PLOT OF |RESIDUALS| VS X?'
    '(YES OR NO)'
    QS5←1↑⍞
    →END×⍳(QS5≠'Y')
    'DO YOU WANT THIS PLOT SMOOTHED?'
    '(YES OR NO)'
    QS6←1↑⍞
    →PL3×⍳(QS6≠'Y')
    X LOWS (|RESY)
    RUN SRESPLT
    →END
    PL3:RUN RESPLT
    END:
LOWS is used to smooth the (Yi - Ŷi) residuals obtained
from smoothing the original data set. It operates exactly
like LOWESS except for the data input and graphical output
sections.



    X LOWS Y;N1;Q;WX;J;I;A;B;Q;STRP;U;D;TX;WT;Z;BR;DA;DB;R;U1;M;R0;AR;RHS;YZ
    Y←Y[⍋X]
    X←X[⍋X]
    Q←⌊0.5+Q←(N1←⍴X)×F
    YS←N1⍴0
    WX←N1⍴1
    J←0
    L1:J←J+1
    I←0
    A←1
    B←Q
    L2:I←I+1
    →L6×⍳(I>N1)
    REPEATCK
    →L5×⍳(SKP='Y')
    STRP←(A+(0,⍳(B-A)))
    →L3×⍳0≠(D←⌈/|U←(X[I]∘.-X[STRP]))
    WT←WX[STRP]×TX←Q⍴1
    YS[I]←(+/(LST/Y))÷(+/LST←X=X[I])
    →L5
    L3:WT←WX[STRP]×TX←((1-(|U*3))*3)×((|U←U÷D)<1)
    L4:→R2×⍳(REG≠'L')
    X[STRP] REGRES Y[STRP]
    →L5
    R2:X[STRP] REGRES2 Y[STRP]
    L5:→L2×⍳(B≥N1)∨(I≥N1)
    →L2×⍳((DA←(X[I+1]-X[A]))≤(DB←(X[B+1]-X[I+1])))
    A←A+1
    B←B+1
    →L5
    L6:R0←|R[⍋(|R←(Y-YS))]
    →L10×⍳(0≠M←0.5×+/|(R0[(⌈N1÷2),1+⌊N1÷2]))
    U1←1
    →L11
    L10:U1←R÷(6×M)
    L11:WX←((1-(U1*2))*2)×((|U1)<1)
    →L12×⍳(ROB≠'Y')
    →L1×⍳(J≤2)
    L12:
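
The strip-advance test used in both LOWESS and LOWS (the lines computing DA and DB) can be written out as a small Python sketch. It is an illustration under stated assumptions rather than a translation of the APL: indices are 0-based, the function name is hypothetical, and the X values are assumed to be sorted in ascending order.

```python
def advance_strip(xs, a, b, i):
    """Return the strip (a, b) to use for point i + 1.

    xs   -- X values sorted in ascending order
    a, b -- first and last index (inclusive) of the strip used for point i
    i    -- index of the point that has just been smoothed
    """
    n = len(xs)
    while b < n - 1 and i < n - 1:
        da = xs[i + 1] - xs[a]       # distance from next point to left edge
        db = xs[b + 1] - xs[i + 1]   # distance to the candidate on the right
        if da <= db:
            break                    # current strip already holds the nearest points
        a += 1                       # drop the left-most point ...
        b += 1                       # ... and take in the next point on the right
    return a, b
```

For ten equally spaced points and a strip of three, the strip stays at the first three points for the first two values of X, then slides one point at a time, and stops once it reaches the last three points.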
REGRES computes linear least squares regressions of Y on
X while REGRES2 computes quadratic least squares regressions
of Y on X.



    XR REGRES YR;DEN;W1;B1;B2
    DEN←((+/W1)×(+/W1×XR*2))-((+/XR×W1←WT*0.5)*2)
    →L1×⍳((|DEN)≥0.0001)
    YS[I]←(+/YR)÷⍴YR
    →0                                   ⍝ BRANCH ASSUMED; NOT LEGIBLE IN LISTING
    L1:B2←(((+/W1)×(+/(W1×XR×YR)))-((+/W1×XR)×(+/W1×YR)))÷DEN
    B1←((+/W1×YR)-B2×(+/W1×XR))÷(+/W1)
    YS[I]←B1+B2×X[I]



    X2 REGRES2 Y2
    A1←(+/X2×(WT*0.5))
    A2←(+/(X2*2)×(WT*0.5))
    A3←(+/(X2*3)×(WT*0.5))
    AR2←3 3⍴(+/WT*0.5),A1,A2,A1,A2,A3,A2,A3,(+/(X2*4)×(WT*0.5))
    RHS2←(+/Y2×WT*0.5),(+/X2×Y2×WT*0.5)
    RHS2←3 1⍴RHS2,(+/(X2*2)×Y2×WT*0.5)
    BR←RHS2⌹AR2                          ⍝ LINE NOT LEGIBLE IN LISTING; RECONSTRUCTED
    YS[I]←BR[1;1]+(BR[2;1]×X[I])+(BR[3;1]×X[I]*2)
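
The weighted linear fit performed by REGRES can be sketched in Python as follows. This is a hedged illustration, not the thesis code: the function name is hypothetical, the weights are taken as supplied by the caller (the APL listing forms its sums with the square roots of WT), and the fallback to the plain mean of Y when the denominator is nearly zero follows the 0.0001 threshold that appears in the listing.

```python
def weighted_line_fit(xs, ys, ws, x0):
    """Fit y = b1 + b2*x by weighted least squares and evaluate it at x0."""
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    den = sw * swxx - swx ** 2
    if abs(den) < 0.0001:
        # Degenerate strip (for example, all X values equal):
        # use the plain average of Y instead of a fitted line.
        return sum(ys) / len(ys)
    b2 = (sw * swxy - swx * swy) / den   # slope
    b1 = (swy - b2 * swx) / sw           # intercept
    return b1 + b2 * x0
```

When the points in the strip lie exactly on a straight line, the fitted value reproduces that line regardless of the weights, which gives a simple check on the arithmetic.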



The following character strings are the screen vectors
used by the RUN function of GRAFSTAT to produce the plots of
the LOWESS smoothed curves of the original data and of the
absolute value of the residuals.

:tn*NRPLT 73 CHARACTER 

MOX10Y1 1 YSOO 1010.* + *vx>°« + + 0' ‘0PHDR0XAXIS0YAXIS0210LIN0LIN01 1 100 1 
0 6 



**RESPLT BO CHARACTER 

M10XO< |RESY>00010.*+x*A°«+t0’ ’O' '0XAXIS0' |RESIDUALS| ' 0220LIN0LIN01 I 
100 1 0 00 



##RPLT 73 CHARACTER 

M0X10Y1 j YSOO 1010.* + *^AO< + t^"0PHDR0XAXIS0YAXIS02i0LIN0LIN01 1 100 1 
0 0 



»*SRESPLT 85 CHARACTER 

i»«10X0< |RE5Y>)YS0O 

l010.* + *^AO«4t0"0’ '0XAXIS0' |RESIDUALS| * 0220LIN0LIN01 1 1 00 1 0 
00 
APPENDIX B
FORTRAN PROGRAMS



This appendix contains a listing of the FORTRAN program
and subroutine written to support this thesis. The IMSL
programs, LLBQF and PXSORT, used to support the LOWESS
program are not listed. Detailed user instructions for
operating these programs are contained in Chapter V.



SJOB C 

REAL 

X (20 0) , Y (200) , YS j20 0\/20 0*0. 0/,WXj(20 0^/200*1.0/, A (2,2^ X B (2, 1) 



C, 0 (20 0j >20b*0. b/.D. Ui.TX'(2bO) /2&0*0. 0/. WT'( 20'0)' /^6 o*’ 0:0> 
C. WK (22), DA.DB. B (20 0) /200*0.0/,R1 (200) /20 0*0. 0/, RU,?,C (4) 
c) , W, BETA (2.1) , BED C 
INTEGER 



AX, BX.A 1,0, II. 12,13, 14, 15, 16, 17, 18, 19, I 1 0, N, INK (2) ,IER, ROE 
C,IF1,IF2 C 

DATA AX/1/ ,ROB/-1/, N/0/ C 
F= . 33 
IF 1 = 2 
IF2=4 
N= 0 
N = N+ 1 

READ (IF1, 901, END=2) X(N) ,Y(N) 

GO TO 1 
N=N- 1 

CALI XYSORT (X, Y.1 ,N) 

Q=IFIX( (FLOAT (N)*F) +. 5) 

CONTINUE 
AX=1 

A 1= (AX- 1 ) 

BX=Q 

DO 65 11=1, N 
12 = 0 
D=0. 0 

DO 10 13= AX , EX 
12 = 12*1 



0 (I2)=X (II) -X (13) 
IF (.NOT. ABS (U (12) ) 



,GE. D) GO TO 5 



5 

10 



D = ABS (0 (12) ) 

CONTINOE 

CONTINUE 

IF (.NOT.D.GT. 0. 0000 1) GO TO 30 
DO 25 14 = 1, “ 



U 1 = AB S 





IF { . NOT. U 1 . 
TX (14) = ( 
WT (14) =T 




GO TO 20 


15 


CONTINUE 




oo 

II II 

MM 

MH 

Mts 


20 


CONTINUE 


25 


CONTINUE 



miM. 



0) GO TO 15 
l. U- (U 1**3) ) **3 
X (14) * W X (A 1+14) 
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30 



35 

40 



GO TO 40 
CONTINUE 

DO 3 5 15 = 1, Q 
TX (1 5 ) =1.0 
WT (I5)=WX ( A 1 +15) 
CONTINUE 
CONTINUE C 
A (1,1) =0. 0 
A 1 , 2 ) =0. 0 
A (2, 1} =0. 0 
A 2 , 2 ) =0.0 
B (1, 1) =0. 0 
B 2 , 1) =0.0 



DO 4 5 16=1. Q 
I7=A 1+16 
W=SQRT (WT (16 
A(1, 11 = A (1,1 



, 1 , 2 

2 , 2 



= A 
= A 



b ;i, i) =b 
B j2, 1) =B 
CONTINUE 



1,2 

2 2 

1 



+ W 
+ 

+ 

+ 



A (2. 1)=A(1, 2) C 

CALL LLBQF (A,2.2, 2,B, 2. 1 . 0 , C, 3ETA, 2, IWK , WK , IER) C 
YS (111= BETA (1,1) +BETA (2,1) *X(I1) 

CONTINUE 



IF (BX.GE.N) GC TO 60 
IF (II . GE. N) GC TO 60 
DA = X (I1 + 11-X (AX) 
DB=X (BX+ 1)-X (11 + 1) 



45 



50 



IF (.NOT. DA. G 
A X= AX + 1 
BX=BX+ 1 
GO TO 50 

55 CONTINUE 

60 CONTINUE 
A 1 = (AX- 1) 

65 CONTINUE C 
DO 70 18=1, N 

R (18) =Y (18) -YS (18) 
RT (18) =AB S (R (18) ) 
70 CONTINUE C 

CALL PXSORT (R 1 , 1, N) C 
L1=(N+1)/2 
L 2= (N + 2J /2 

MED= (R1 fl 1) +R1 (I2))/2 
DO 85 19=1, N 



X (17) *W) 

W* (X (17) **2) ) 

Y (I7)'*W) 

+ (Y (17) *X (17) *W) 



DB) GO TO 55 



0 
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IF ((R1 (I§) .GT.0.0) .AND. (ABS(MED) .GT.0.0)) GO TO 71 
' WX (1 9) = 1 . 0 
GO TO 80 

RU=R (19)/ (6-C*MED) 

IF (.NOT.ABS (ED) .LT. 1.0) GO TO 75 
WX (I9) = (i.0-(RU**2)) **2 



WX (19) = ( 

GO TO 80 

75 CONTINUE 

WX (19) =0.0 
80 CONTINUE C 

85 CONTINUE C TEST 

WRITE (6, 99 11 (WX (L) L=1 



N) 



991 FORMAT f ix;i6*7:jrc e&d ; TEST C 

FCE=ROB+1 C IF (.NOT. ROB. GE. 2) GO TO 

DO 9 0 1-1 0= 1 , N 

WRITE (IF2, 9 00) X (110) ,Y (110) ,YS (110) 
90 CONTINUE 
STOP 

900 FORMAT (1X.3F15.3) 

901 FCFMAT (2F1 5.3) 

END C 

SUBROUTINE XYSOFT (A, B , II , JJ) C 



75 



5 

10 



20 



30 



40 

50 



60 

70 

80 

90 



100 



DIMENSION 
M= 1 
I=II 
J=JJ 
IF (I 
K=I 
IJ= 

T 

T 1 

B 
A 
B 



A (JJ) ,B(JJ) # 10 (16) , IL (1 6) 



,GE. J) GO 10 70 



SliS ?' 2 

1 =B (IJ) 



ijhhn 

ij[=b|i5 
)=T 



LE. T) GO TO 20 



T1 



T= A (IJ) 
T1=B (IJ) 

[I J) =B 




GE. T) GO TO 40 



IJ =A I 
IJ)=B I 
I) =T 
I) =T 1 
T= A (IJ) 
T1=B (IJ) 
GO TO 40 
TT=A (L) 
TT1 = B (L) 



T) GO TO 40 







GT. T) GO TO 40 



IT. 



L, 



IF L-I 
IL M) =1 

isr =1 

M=M+1 
GO TO 80 
II (M) =K 
10 (M) =J 
J=1 
M=M+ 1 
GO TO 80 
M = M- 1 
IF JM .EQ. 
I=IL (M) 
J=IO (Mj 
IF (J-I 
IF 



LE. 



50 

TO 30 
GO TO 60 



T^ GO TO 
J-K) 



0) REIORN 



F(J-I .GE. 
Fjl^.EQ. i; 



i - - - IX > 

1 = 1+1 
IF (I .EQ. J) 
IF A (I) .LE. 

T = A (1+1) 

T 1 =B (1+1) 

K=I 

A (K+ 1 ) =A 
B (K+1) =B 



1 1 ) GO TO 10 
GC TO 5 



GC TO 70 
A (1+1) ) GO TO 



90 



76 



K=K-1 
IF (T .LT. 

A (K+1)=T 
B K+ 1) =T 1 
GO TO 90 
END SENTRY 



A (K) ) GO TO 



100 



The following LOWS EXEC routine sets the file definitions
and invokes the appropriate system libraries required
to execute LOWESS. This routine is executed by typing "LOWS
EXEC".



    GLOBAL MACLIB IMSLSP NONIMSL
    FILEDEF 02 DISK LOW2 DATA A PERM
    FILEDEF 03 DISK LOW3 DATA A PERM
    FILEDEF 04 DISK LOW4 DATA A PERM
    FILEDEF 07 DISK LOW7 DATA A PERM
    FILEDEF 08 DISK LOW8 DATA A PERM
APPENDIX C
DATA SETS

This appendix contains four data sets that were used to
compare LOWESS with MOVING AVERAGE/COSINE ARCH and LEAST
SQUARES REGRESSION routines in Chapter III. They include:

1. TEST SET ONE ... used to test LOWESS' ability to
detect and follow linear trends.

2. TEST SET TWO ... used to check LOWESS' performance on
data sets that contain abrupt changes in curvature.

3. TEST SET THREE ... used to test LOWESS' ability to
follow smooth changes in curvature.

4. Lag-1 points from NEAR(1) data ... used to check
LOWESS' performance on unequally spaced data.
                         TABLE IX

                       Data Set One

        X        Y        X        Y        X        Y

     .200    -.398   10.200    8.696   20.200   21.520
     .400    -.811   10.400   10.305   20.400   19.996
     .600    -.103   10.600   10.997   20.600   21.018
     .800    1.156   10.800   10.273   20.800   21.047
    1.000    1.653   11.000   11.345   21.000   21.704
    1.200    1.416   11.200   10.477   21.200   21.832
    1.400    1.136   11.400   12.668   21.400   20.408
    1.600    3.402   11.600   11.569   21.600   23.367
    1.800    1.157   11.800   12.578   21.800   21.418
    2.000    2.110   12.000   14.180   22.000   21.089
    2.200    1.481   12.200   12.638   22.200   21.204
    2.400    2.821   12.400   13.733   22.400   23.595
    2.600     .669   12.600   12.851   22.600   22.441
    2.800    3.460   12.800   12.490   22.800   25.504
    3.000    1.897   13.000   12.077   23.000   22.802
    3.200    3.097   13.200   12.815   23.200   23.059
    3.400    2.340   13.400   14.558   23.400   23.811
    3.600    2.361   13.600   14.463   23.600   22.421
    3.800    1.911   13.800   12.765   23.800   23.522
    4.000    3.026   14.000   13.807   24.000   22.419
    4.200    4.412   14.200   12.900   24.200   25.249
    4.400    4.893   14.400   14.707   24.400   24.703
    4.600    6.147   14.600   15.569   24.600   23.373
    4.800    5.445   14.800   14.053   24.800   24.870
    5.000    2.852   15.000   12.204   25.000   24.603
    5.200    4.171   15.200   15.897   25.200   26.589
    5.400    5.258   15.400   18.607   25.400   26.764
    5.600    3.073   15.600   16.136   25.600   26.258
    5.800    5.487   15.800   16.098   25.800   26.291
    6.000    5.406   16.000   16.284   26.000   26.801
    6.200    6.532   16.200   17.160   26.200   25.433
    6.400    6.959   16.400   18.488   26.400   26.764
    6.600    7.500   16.600   18.125   26.600   26.202
    6.800    6.599   16.800   16.605   26.800   27.664
    7.000    6.766   17.000   17.017   27.000   26.822
    7.200    8.650   17.200   17.446   27.200   29.074
    7.400    9.236   17.400   16.546   27.400   27.572
    7.600    7.217   17.600   18.758   27.600   28.872
    7.800    7.955   17.800   17.962   27.800   27.765
    8.000    7.035   18.000   19.557   28.000   26.499
    8.200    8.239   18.200   18.006   28.200   28.565
    8.400    9.165   18.400   20.051   28.400   28.201
    8.600    8.005   18.600   16.701   28.600   27.210
    8.800    8.930   18.800   20.623   28.800   29.029
    9.000    9.035   19.000   17.482   29.000   29.271
    9.200    8.575   19.200   18.149   29.200   28.834
    9.400    8.860   19.400   19.450   29.400   30.777
    9.600   11.480   19.600   18.145   29.600   28.802
    9.800    8.796   19.800   20.267   29.800   28.863
   10.000    9.503   20.000   20.545   30.000   29.998
                                  TABLE X

                                Data Set Two

        X        Y        X        Y        X        Y        X        Y

     .200    -.462   11.200    3.849   22.200    4.819   33.200    1.657
     .400   -2.191   11.400    4.554   22.400    4.469   33.400    2.245
     .600    1.405   11.600    3.182   22.600    4.997   33.600     .862
     .800     .947   11.800    3.159   22.800    6.256   33.800    3.226
    1.000     .475   12.000    4.518   23.000    6.278   34.000    1.362
    1.200     .832   12.200    5.736   23.200    6.490   34.200    2.923
    1.400    -.137   12.400    4.989   23.400    5.499   34.400    2.736
    1.600    2.336   12.600    3.752   23.600    5.860   34.600    1.736
    1.800     .779   12.800    5.165   23.800    4.325   34.800    2.129
    2.000    2.597   13.000    4.052   24.000    4.949   35.000    1.433
    2.200    1.144   13.200    3.594   24.200    6.690   35.200    1.313
    2.400    1.832   13.400    3.895   24.400    6.339   35.400    2.756
    2.600    -.406   13.600    3.747   24.600    5.899   35.600    1.576
    2.800     .419   13.800    4.171   24.800    4.233   35.800     .363
    3.000    2.446   14.000    4.962   25.000    5.825   36.000    2.955
    3.200     .641   14.200    3.356   25.200    5.742   36.200     .266
    3.400    1.937   14.400    4.792   25.400    4.873   36.400    1.664
    3.600    1.080   14.600    5.593   25.600    5.497   36.600     .323
    3.800    1.384   14.800    4.630   25.800    7.697   36.800     .783
    4.000     .251   15.000    5.203   26.000    4.600   37.000    1.419
    4.200     .410   15.200    4.468   26.200    3.374   37.200    1.997
    4.400    2.745   15.400    6.558   26.400    2.242   37.400     .533
    4.600    1.795   15.600    5.484   26.600    4.078   37.600    1.137
    4.800    1.121   15.800    2.766   26.800    4.090   37.800     .506
    5.000    1.235   16.000    4.635   27.000    3.519   38.000     .671
    5.200    2.942   16.200    2.812   27.200    6.651   38.200    -.612
    5.400    2.104   16.400    5.668   27.400    5.513   38.400     .376
    5.600    2.753   16.600    5.055   27.600    5.141   38.600    1.921
    5.800    2.717   16.800    5.319   27.800    4.818   38.800    -.476
    6.000    3.156   17.000    5.574   28.000    1.451   39.000   -1.014
    6.200    2.880   17.200    6.472   28.200    5.936   39.200    1.788
    6.400    1.219   17.400    4.420   28.400    4.205   39.400    1.306
    6.600    3.015   17.600    4.623   28.600    3.202   39.600     .853
    6.800    3.845   17.800    5.396   28.800    1.977   39.800   -1.468
    7.000    3.529   18.000    5.778   29.000    4.046   40.000    1.554
    7.200     .503   18.200    3.705   29.200    5.971   40.200    -.542
    7.400    2.686   18.400    4.290   29.400    4.175   40.400   -2.351
    7.600    2.717   18.600    4.900   29.600    4.583   40.600    1.165
    7.800    3.438   18.800    2.397   29.800    3.479   40.800     .627
    8.000    2.689   19.000    6.059   30.000    4.621   41.000     .075
    8.200    3.278   19.200    3.894   30.200    1.989   41.200     .352
    8.400    4.967   19.400    6.093   30.400    4.408   41.400    -.697
    8.600    4.288   19.600    4.174   30.600    3.896   41.600    1.696
    8.800    3.788   19.800    5.615   30.800    3.112   41.800     .059
    9.000    2.677   20.000    5.820   31.000    3.422   42.000    1.797
    9.200    3.610   20.200    4.844   31.200    4.740   42.200     .264
    9.400    3.908   20.400    5.602   31.400    3.108   42.400     .872
    9.600    3.283   20.600    4.933   31.600    3.892   42.600   -1.446
    9.800    3.583   20.800    5.634   31.800    1.630   42.800    -.701
   10.000    4.415   21.000    4.003   32.000    4.039   43.000    1.246
   10.200    5.578   21.200    4.389   32.200    4.600   43.200    -.639
   10.400    1.596   21.400    6.545   32.400    2.125   43.400     .577
   10.600    2.962   21.600    4.540   32.600    1.625   43.600    -.360
   10.800    5.203   21.800    5.417   32.800    1.602   43.800    -.136
   11.000    4.682   22.000    3.613   33.000    3.180   44.000   -1.349
                         TABLE XI

                      Data Set Three

        X        Y        X        Y        X        Y

     .063     .261    2.135     .560    4.208   -1.733
     .126    -.129    2.198     .716    4.270    -.860
     .188     .053    2.261    1.376    4.333     .049
     .251    -.293    2.324     .410    4.396    -.870
     .314    1.316    2.386     .988    4.459   -1.282
     .377    1.340    2.449     .326    4.522   -1.701
     .440    -.335    2.512     .875    4.584   -1.025
     .502    1.451    2.575     .175    4.647    -.811
     .565     .088    2.638    1.079    4.710    -.891
     .628     .435    2.700     .520    4.773   -1.088
     .691     .915    2.763    1.167    4.836    -.980
     .754     .522    2.826     .471    4.898    -.662
     .816    1.398    2.889     .684    4.961    -.508
     .879    1.381    2.952     .835    5.024   -1.729
     .942     .011    3.014     .344    5.087    -.599
    1.005     .310    3.077    -.129    5.150   -1.211
    1.068     .496    3.140    -.055    5.212    -.595
    1.130    1.115    3.203    -.543    5.275   -1.151
    1.193     .713    3.266   -1.152    5.338    -.195
    1.256    1.304    3.328    -.111    5.401    -.275
    1.319    1.082    3.391     .024    5.464   -1.133
    1.382     .474    3.454    -.180    5.526    -.982
    1.444    1.062    3.517    -.520    5.589     .206
    1.507     .624    3.580    -.633    5.652    -.113
    1.570     .686    3.642     .088    5.715   -1.503
    1.633    1.695    3.705    -.339    5.778    -.228
    1.696     .168    3.768     .216    5.840    -.232
    1.758    -.025    3.831    -.223    5.903    -.824
    1.821    1.215    3.894     .052    5.966    -.949
    1.884     .174    3.956   -1.417    6.029    -.078
    1.947     .860    4.019    -.899    6.092    -.788
    2.010    1.028    4.082    -.310    6.154     .205
    2.072     .743    4.145     .074    6.217    -.100
TABLE XII



Lag-1 Data Derived from NEAR(1) Process



X 


Y 


X 


Y 


1 .020 


.466 


.871 


.822 


.035 


1 .020 


.747 


.871 


.129 


.035 


1 .385 


.747 


.125 


• .129 


1.189 


1 .385 


.153 


.125 


.017 


1.189 


.233 


.153 


.261 


.017 


2.077 


.233 


.366 


.261 


2.155 


2.077 


.349 


.366 


1.821 


2.155 


.364 


.349 


.042 


1 .821 


1.140 


.364 


.036 


.042 


1 .020 


1.140 


.061 


.036 


3.508 


1 .020 


.149 


.061 


3.122 


3.508 


4.260 


.149 


2.623 


3.122 


4.095 


4.260 


2.654 


2.623 


3.422 


4.095 


.209 


2.654 


2.854 


3.422 


.255 


.209 


2.609 


2.854 


.271 


.255 


2.176 


2.609 


1.185 


.271 


1 .823 


2.176 


.989 


1 .185 


1.617 


1.823 


2.867 


.989 


2.439 


1 .617 


2.488 


2.867 


2.047 


2.439 


2.086 


2.488 


1 .840 


2.047 


1 .756 


2.086 


3.049 


1 .840 


1 .530 


1 .756 


2.682 


3.049 


1 .456 


1 .530 


2.239 


2.682 


.180 


1 .456 


1 .889 


2.239 


.429 


.180 


1 .577 


1 .889 


.031 


.429 


1 .664 


1 .577 


2.951 


.031 


.103 


1 .664 


2.565 


2.951 


.133 


.103 


2.133 


2.565 


. 1 45 


.133 


3.737 


2.1 33 


.207 


. 1 45 


3.180 


3.737 


.221 


.207 


2.675 


3.180 


.196 


.221 


2.307 


2.675 


.170 


.196 


1 .996 


2.307 


.185 


. 1 70 


1 .892 


1 .996 


.087 


.185 


1 .700 


1 .892 


2.258 


.087 


1 .716 


1 .700 


1.938 


2.258 


1.599 


1 .716 


1 .617 


1 .938 


1 .498 


1 .599 


1 .346 


1 .617 


1 .247 


1.498 


1 .184 


1 .346 


.044 


1.247 


1 .007 


1.184 


.306 


.044 


.853 


1 .007 


.255 


.306 


.779 


.853 


.258 


.255 


.727 


.779 


.519 


.258 


.822 


.727 


.650 


.519 



m NE AR ( 1) Process 



X 


Y 


X 


Y 


563 


.650 


.313 


.304 


049 


.563 


.376 


.313 


133 


.049 


.329 


.376 


334 


.133 


.363 


.329 


596 


.334 


.556 


.363 


604 


.596 


.655 


.556 


527 


.604 


.544 


.655 


934 


.527 


.569 


.544 


797 


.934 


.531 


.569 


496 


1 .797 


.518 


.531 


420 


1 .496 


.584 


.518 


522 


1 .420 


4.292 


.584 


353 


1 .522 


3.610 


4.292 


187 


1 .353 


4.074 


3.61 0 


050 


1.187 


3.492 


4.074 


898 


1 .050 


3.644 


3.492 


854 


.898 


3.147 


3.644 


631 


.854 


.022 


3.1 47 


363 


1 .631 


.330 


.022 


172 


1 .363 


.310 


.330 


303 


1.172 


.597 


.310 


229 


1 .303 


.551 


.597 


061 


1 .229 


.544 


.551 


962 


1 .061 


.817 


.544 


907 


.962 


.808 


.81 7 


856 


.907 


.715 


.808 


1 35 


.856 


.601 


.715 


953 


1.135 


.618 


.601 


728 


.953 


1 .525 


.618 


010 


1 .728 


1 .526 


1 .525 


073 


.010 


1 .279 


1 .526 


082 


.073 


1 .065 


1 .279 


096 


.082 


.929 


1 .065 


098 


.096 


.81 4 


.929 


234 


.098 


.703 


.81 4 


046 


.234 


.704 


.703 


017 


1 .046 


.898 


.704 


239 


1.017 


.785 


.898 


105 


1 .239 


1 .065 


.785 


124 


.105 


.995 


1 .065 


< nn 

1 W te 


.124 


3.157 


.995 


122 


.122 


2.710 


3.157 


154 


.122 


2.265 


2.710 


165 


.154 


1 .883 


2.265 


205 


.165 


1 .566 


1 .883 


Y90 


.205 


1 .488 


1 .566 


.315 


. 1 90 


1 .268 


1 .488 


.335 


.315 


1 .206 


1 .268 


.304 


.335 


2.825 


1 .206 



fro 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 
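
The lag-1 pairing used in the table above can be illustrated with a short sketch. It is written in Python purely for illustration and is not the procedure used to prepare the table; the function name lag1_pairs is hypothetical. It assumes, as the opening rows of the table suggest, that each Y value is the series value immediately preceding the corresponding X value. The sample series is taken from the first few entries of the table.

```python
def lag1_pairs(series):
    """Return (X, Y) pairs in which Y is the value immediately preceding X."""
    return [(series[t + 1], series[t]) for t in range(len(series) - 1)]


# First few series values implied by the opening rows of the table
# (assumption: the Y of each row is the X of the row before it).
series = [0.466, 1.020, 0.035, 0.129, 0.125]
pairs = lag1_pairs(series)

# Print the pairs in the same X, Y column order as the table.
for x, y in pairs:
    print(f"{x:6.3f} {y:6.3f}")
```

Plotting each X against its Y gives the lag-1 scatter plot of the series; a series of n values yields n - 1 such pairs.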
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